# Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation
[![Open in Streamlit](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://huggingface.co/spaces/flax-community/DietNerf-Demo) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1etYeMTntw5mh3FvJv4Ubb7XUoTtt5J9G?usp=sharing)
<p align="center"><img width="450" alt="Screenshot 2021-07-04 4:11:51 PM" src="https://user-images.githubusercontent.com/77657524/126361638-4aad58e8-4efb-4fc5-bf78-f53d03799e1e.png"></p>
Welcome to the Putting NeRF on a Diet project!
This project is a PyTorch and JAX/Flax based implementation of the paper [Putting NeRF on a Diet](https://arxiv.org/abs/2104.00677) by Ajay Jain, Matthew Tancik, and Pieter Abbeel.
The model renders novel views with NeRF (Neural Radiance Fields) in a few-shot learning setting.
The semantic loss uses embeddings from a pre-trained CLIP Vision Transformer, which provide 2D supervision for the 3D representation.
With only a few input images, DietNeRF outperforms the original NeRF in 3D reconstruction and neural rendering.
## 🤗 Hugging Face Hub Repo URL:
We will also upload our project to the Hugging Face Hub repository.
[https://huggingface.co/flax-community/putting-nerf-on-a-diet/](https://huggingface.co/flax-community/putting-nerf-on-a-diet/)
Our JAX/Flax implementation currently supports:
<table class="tg">
<thead>
<tr>
<th class="tg-0lax"><span style="font-weight:bold">Platform</span></th>
<th class="tg-0lax" colspan="2"><span style="font-weight:bold">Single-Host GPU</span></th>
<th class="tg-0lax" colspan="2"><span style="font-weight:bold">Multi-Device TPU</span></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tg-0lax"><span style="font-weight:bold">Type</span></td>
<td class="tg-0lax">Single-Device</td>
<td class="tg-0lax">Multi-Device</td>
<td class="tg-0lax">Single-Host</td>
<td class="tg-0lax">Multi-Host</td>
</tr>
<tr>
<td class="tg-0lax"><span style="font-weight:bold">Training</span></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
</tr>
<tr>
<td class="tg-0lax"><span style="font-weight:bold">Evaluation</span></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
<td class="tg-0lax"><img src="http://storage.googleapis.com/gresearch/jaxnerf/check.png" alt="Supported" width=18px height=18px></td>
</tr>
</tbody>
</table>
## 🤩 Demo
- Streamlit Space Demo
You can try our Streamlit Space demo at the following site.
Given any input camera pose, it renders the corresponding novel view.
[https://huggingface.co/spaces/flax-community/DietNerf-Demo](https://huggingface.co/spaces/flax-community/DietNerf-Demo)
- Colab Demo
We also prepared a Colab notebook for you.
You need a Colab Pro account to run our model on Colab (because of memory limits).
[https://colab.research.google.com/drive/1etYeMTntw5mh3FvJv4Ubb7XUoTtt5J9G?usp=sharing](https://colab.research.google.com/drive/1etYeMTntw5mh3FvJv4Ubb7XUoTtt5J9G?usp=sharing)
## 💻 Installation
```bash
# Clone the repo
svn export https://github.com/google-research/google-research/trunk/jaxnerf
# Create a conda environment. Use Python 3.6-3.8, because one of the
# dependencies (TensorFlow) does not support Python 3.9 yet.
conda create --name jaxnerf python=3.6.12; conda activate jaxnerf
# Prepare pip
conda install pip; pip install --upgrade pip
# Install requirements
pip install -r jaxnerf/requirements.txt
# [Optional] Install GPU and TPU support for Jax
# Remember to change cuda101 to your CUDA version, e.g. cuda110 for CUDA 11.0.
pip install --upgrade jax jaxlib==0.1.57+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
# Install Flax and Hugging Face Transformers with Flax support
pip install flax "transformers[flax]"
```
## ⚽ Dataset
Download the datasets from the [NeRF official Google Drive](https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1).
Please download `nerf_synthetic.zip` and unzip it
wherever you like. Let's assume the data is placed under `/tmp/jaxnerf/data/`.
## 💖 Methods
You can find a more detailed explanation of DietNeRF in the following **Notion report**:
* 👉👉 VEEEERY detailed DietNeRF explanation docs: https://www.notion.so/DietNeRF-Putting-NeRF-on-a-Diet-4aeddae95d054f1d91686f02bdb74745
<p align="center"><img width="400" alt="Screenshot 2021-07-04 4:11:51 PM" src="https://user-images.githubusercontent.com/77657524/124376591-b312b780-dce2-11eb-80ad-9129d6f5eedb.png"></p>
Based on the principle
that “a bulldozer is a bulldozer from any perspective”, our proposed DietNeRF supervises the radiance field from arbitrary poses
(DietNeRF cameras). This is possible because we compute a semantic consistency loss in a feature space capturing high-level
scene attributes, not in pixel space. We extract semantic representations of renderings using the CLIP Vision Transformer, then
maximize similarity with representations of ground-truth views. In
effect, we use prior knowledge about scene semantics learned by
single-view 2D image encoders to constrain a 3D representation.
You can find more details in the authors' paper. The structure of the CLIP-based semantic loss is shown in the following image.
<p align="center"><img width="600" alt="Screenshot 2021-07-04 4:11:51 PM" src="https://user-images.githubusercontent.com/77657524/126386709-a4ce7ff8-2a68-442f-b4ed-26971fb90e51.png"></p>
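To make "maximize similarity with representations of ground-truth views" concrete, here is a minimal sketch of one common way to write such a loss: one minus the cosine similarity of two embedding vectors. It is only an illustration in plain Python. The function names are hypothetical, the toy vectors stand in for CLIP embeddings, and the exact form and weighting used in this repository may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def semantic_consistency_loss(rendered_embedding, target_embedding):
    """Return 1 - cosine similarity, so identical directions give a loss near 0."""
    a = l2_normalize(rendered_embedding)
    b = l2_normalize(target_embedding)
    cosine = sum(x * y for x, y in zip(a, b))
    return 1.0 - cosine


# Toy 3-dimensional "embeddings" (assumption: real CLIP embeddings are much longer).
same_direction = semantic_consistency_loss([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = semantic_consistency_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Because both vectors are normalized first, the loss depends only on the direction of the embeddings, not on their magnitude.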
Our code uses the JAX/Flax framework, which makes it considerably faster than other NeRF code. We also implemented ray rendering distributed across multiple GPUs, which greatly reduces training time. Finally, our code uses the Hugging Face Transformers library for the CLIP model.
## 🤟 How to use
```bash
# Example value for --data_dir: nerf_synthetic/lego
python -m train \
  --data_dir=/PATH/TO/YOUR/SCENE/DATA \
  --train_dir=/PATH/TO/THE/PLACE/YOU/WANT/TO/SAVE/CHECKPOINTS \
  --config=configs/CONFIG_YOU_LIKE
```
You can toggle the semantic loss with the `use_semantic_loss` option in the configuration files.
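As a rough illustration of what such a toggle typically controls, the sketch below adds a weighted semantic term to the pixel loss only when the flag is on. The function name and the `semantic_weight` value are assumptions made for this example and are not taken from the training code.

```python
def total_loss(pixel_loss, semantic_loss, use_semantic_loss, semantic_weight=0.1):
    """Combine losses; the semantic term is skipped when the flag is off.

    `semantic_weight` is a hypothetical hyperparameter used only for illustration.
    """
    if use_semantic_loss:
        return pixel_loss + semantic_weight * semantic_loss
    return pixel_loss


plain_nerf = total_loss(0.5, 0.2, use_semantic_loss=False)
diet_nerf = total_loss(0.5, 0.2, use_semantic_loss=True)
```

With the flag off, training reduces to the usual pixel-space NeRF objective.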
## 💎 Experiment Results
### ❗ Images rendered by 8-shot trained DietNeRF
DietNeRF has a strong capacity to generalise to novel and challenging views with EXTREMELY SMALL TRAINING SAMPLES!
### CHAIR / HOTDOG / DRUM / LEGO / MIC
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913354-57c12c14-d550-4061-b745-f025f73b369b.png" width="250"/><img alt="" src="https://user-images.githubusercontent.com/77657524/126913363-8e0d9192-d02e-43c8-b29f-df54e09fab28.png" width="250"/><img alt="" src="https://user-images.githubusercontent.com/77657524/126913383-0a8b50df-da81-46b2-baac-2de5f20a7621.png" width="250"/>
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913553-19ebd2f2-c5f1-4332-a253-950e41cb5229.gif" width="300"/><img alt="" src="https://user-images.githubusercontent.com/77657524/126913559-dfce4b88-84a8-4a0a-91eb-ed12716ab328.gif" width="300"/>
### ❗ GIFs rendered by occluded 14-shot trained NeRF and DietNeRF
We created an artificial occlusion on the right side of the object by picking only left-side training poses.
This experiment lets us compare reconstruction quality under occlusion.
DietNeRF shows better quality than the original NeRF when part of the scene is occluded.
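One possible way to build such a one-sided training set is sketched below. It assumes frames shaped like the entries of `transforms_train.json` in `nerf_synthetic` (a `file_path` plus a 4x4 camera-to-world `transform_matrix`). Which axis counts as "left" and the selection rule itself are assumptions for illustration, not the procedure we actually used.

```python
def camera_position(transform_matrix):
    """Return the (x, y, z) translation column of a 4x4 camera-to-world matrix."""
    return tuple(row[3] for row in transform_matrix[:3])


def select_one_sided_frames(frames, axis=0, max_views=14):
    """Keep frames whose camera sits on the negative side of `axis`, up to max_views."""
    kept = [
        frame for frame in frames
        if camera_position(frame["transform_matrix"])[axis] < 0.0
    ]
    return kept[:max_views]


def make_frame(name, x):
    """Build a toy frame with an identity rotation and the camera at (x, 0, 4)."""
    matrix = [[1, 0, 0, x], [0, 1, 0, 0], [0, 0, 1, 4], [0, 0, 0, 1]]
    return {"file_path": name, "transform_matrix": matrix}


frames = [make_frame("r_0", -2.0), make_frame("r_1", 1.5), make_frame("r_2", -0.5)]
selected = select_one_sided_frames(frames)
selected_names = [frame["file_path"] for frame in selected]
```

In this toy example only the two cameras with a negative x position are kept, so the other side of the object is never observed during training.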
#### Training poses
<img width="1400" src="https://user-images.githubusercontent.com/26036843/126111980-4f332c87-a7f0-42e0-a355-8e77621bbca4.png">
#### LEGO
[DietNeRF]
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913404-800777f8-8f88-451a-92de-3dda25075206.gif" width="300"/>
[NeRF]
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913412-f10dfb3e-e918-4ff4-aa2c-63529fec91d8.gif" width="300"/>
#### SHIP
[DietNeRF]
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913430-0014a904-6ca1-4a7b-9cd6-6f73b36552fb.gif" width="300"/>
[NeRF]
<img alt="" src="https://user-images.githubusercontent.com/77657524/126913439-2e3128ef-c7ef-4c21-8261-6e3b8fe51f86.gif" width="300"/>
## 👨👧👦 Our Teams
| Teams | Members |
|------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Project Managing | [Stella Yang](https://github.com/codestella) To Watch Our Project Progress, Please Check [Our Project Notion](https://www.notion.so/Putting-NeRF-on-a-Diet-e0caecea0c2b40c3996c83205baf870d) |
| NeRF Team | [Stella Yang](https://github.com/codestella), [Alex Lau](https://github.com/riven314), [Seunghyun Lee](https://github.com/sseung0703), [Hyunkyu Kim](https://github.com/minus31), [Haswanth Aekula](https://github.com/hassiahk), [JaeYoung Chung](https://github.com/robot0321) |
| CLIP Team | [Seunghyun Lee](https://github.com/sseung0703), [Sasikanth Kotti](https://github.com/ksasi), [Khali Sifullah](https://github.com/khalidsaifullaah) , [Sunghyun Kim](https://github.com/MrBananaHuman) |
| Cloud TPU Team | [Alex Lau](https://github.com/riven314), [Aswin Pyakurel](https://github.com/masapasa) , [JaeYoung Chung](https://github.com/robot0321), [Sunghyun Kim](https://github.com/MrBananaHuman) |
* Extremely Don't Sleep Contributors 🤣 : [Seunghyun Lee](https://github.com/sseung0703), [Alex Lau](https://github.com/riven314), [Stella Yang](https://github.com/codestella), [Haswanth Aekula](https://github.com/hassiahk)
# 😎 What we improved from original JAX-NeRF : Innovation
- Neural rendering with few-shot images
- Hugging Face CLIP-based semantic loss loop
- You can choose coarse MLP or coarse + fine MLP training
(coarse + fine is on the `main` branch / coarse is on the `coarse_only` branch)
* coarse + fine : shows good geometric reconstruction
* coarse : shows good PSNR/SSIM results
- Video/GIF rendering of results; the `--generate_gif_only` argument runs fast GIF rendering.
- Cleaned up and refactored the code
- Made multiple models, a Colab notebook, and a Space for a nice demo
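
For readers unfamiliar with the PSNR metric mentioned above, the following minimal sketch shows its standard definition for pixel values in the range [0, 1]. The helper names are hypothetical, and this is not the evaluation code of this repository.

```python
import math


def mean_squared_error(pred, target):
    """Average squared difference between two equally sized pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    error = mean_squared_error(pred, target)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


# Two toy "images" of two pixels each, with a per-pixel error of 0.1.
score = psnr([0.5, 0.5], [0.4, 0.6])
```

An MSE of 0.01 corresponds to about 20 dB, and every tenfold reduction in MSE adds 10 dB.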
# 💞 Social Impact
- Game Industry
- Augmented Reality Industry
- Virtual Reality Industry
- Graphics Industry
- Online shopping
- Metaverse
- Digital Twin
- Mapping / SLAM
## 🌱 References
This project is based on “JAX-NeRF”.
```
@software{jaxnerf2020github,
author = {Boyang Deng and Jonathan T. Barron and Pratul P. Srinivasan},
title = {{JaxNeRF}: an efficient {JAX} implementation of {NeRF}},
url = {https://github.com/google-research/google-research/tree/master/jaxnerf},
version = {0.0},
year = {2020},
}
```
This project is based on “Putting NeRF on a Diet”.
```
@misc{jain2021putting,
title={Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis},
author={Ajay Jain and Matthew Tancik and Pieter Abbeel},
year={2021},
eprint={2104.00677},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
## 🔑 License
[Apache License 2.0](https://github.com/codestella/putting-nerf-on-a-diet/blob/main/LICENSE)
## ❤️ Special Thanks
Our project started during the HuggingFace X GoogleAI (JAX) Community Week event.
https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104
Thank you to our mentor Suraj and the organizers of the JAX/Flax Community Week!
Our team grew through this community learning experience. It was a wonderful time!
<img width="250" alt="Screenshot 2021-07-04 4:11:51 PM" src="https://user-images.githubusercontent.com/77657524/126369170-5664076c-ac99-4157-bc53-b91dfb7ed7e1.jpeg">
Common Computer AI (https://comcom.ai/ko/) sponsored multiple V100 GPUs for our project!
Thank you so much for your support!
<img width="250" alt="Screenshot" src="https://user-images.githubusercontent.com/77657524/126914984-d959be06-19f4-4228-8d3a-a855396b2c3f.jpeg">