tags:
- Pytorch
- Geospatial
- Temporal ViT
---

This repository includes the foundation model architecture of Prithvi, a first-of-its-kind temporal Vision Transformer pre-trained by the IBM and NASA team on continental US Harmonized Landsat Sentinel-2 (HLS) data. The model code lives in the `hls-gfm` folder, alongside instructions for obtaining the pre-trained weights through Hugging Face.

This repo also contains a practical example of fine-tuning Prithvi for flood detection and fire-scar detection as specific downstream applications. See the `fine-tuning-example` folder for more details.

### Model and Input

The model expects remote sensing data in video format, i.e. a tensor of shape (B, C, T, H, W). Note that the temporal dimension is central here and absent from most other work on remote sensing modeling; being able to handle a time series of remote sensing images is very helpful for a variety of downstream tasks. The model can also handle static images, which can simply be fed in with T=1.
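
As a rough sketch of the expected input layout (the shapes below are hypothetical examples; the actual model entry point is defined in the `hls-gfm` folder):

```python
import torch

# Hypothetical shapes for illustration: batch of 2, the 6 HLS bands used in
# pre-training, 3 timesteps, and 224x224 tiles.
B, C, T, H, W = 2, 6, 3, 224, 224
video_input = torch.randn(B, C, T, H, W)   # a short time series per sample

# A static image is just a time series of length one:
static_input = torch.randn(B, C, 1, H, W)  # T=1
```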
### Code
The model follows the [original MAE repo](https://github.com/facebookresearch/mae), with the following modifications, among others (a sketch of the 3D patch embedding appears after the list):

1. replace the 2D patch embedding with a 3D patch embedding
2. replace the 2D positional embedding with a 3D positional embedding
3. replace 2D patchify and unpatchify with their 3D counterparts
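
A minimal sketch of what a 3D patch embedding looks like. This is an illustration only, assuming a `Conv3d`-based tokenizer as in common video-MAE implementations; the class name and defaults here are hypothetical, not the exact `hls-gfm` code:

```python
import torch
import torch.nn as nn

class PatchEmbed3D(nn.Module):
    """Turn a (B, C, T, H, W) video into a sequence of patch tokens."""
    def __init__(self, in_chans=6, embed_dim=768, t_patch=1, patch_size=16):
        super().__init__()
        # A 3D convolution with stride == kernel size tokenizes
        # non-overlapping (t_patch x patch_size x patch_size) cubes.
        self.proj = nn.Conv3d(in_chans, embed_dim,
                              kernel_size=(t_patch, patch_size, patch_size),
                              stride=(t_patch, patch_size, patch_size))

    def forward(self, x):                    # x: (B, C, T, H, W)
        x = self.proj(x)                     # (B, D, T', H', W')
        return x.flatten(2).transpose(1, 2)  # (B, T'*H'*W', D)

tokens = PatchEmbed3D()(torch.randn(2, 6, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 588, 768]); 3 * 14 * 14 = 588 patches
```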
### Pre-training

The model was pre-trained with Harmonized Landsat Sentinel-2 (HLS) data from NASA using the following bands:

* Blue
* Green
* Red
* Narrow NIR
* SWIR 1
* SWIR 2
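
When assembling model inputs from HLS tiles, the bands can be stacked along the channel axis in the order listed above. The helper below is a hypothetical illustration; the function name and the exact ordering expected by the checkpoint are assumptions, so check the `hls-gfm` folder before relying on it:

```python
import numpy as np

# Band order as listed above; whether the checkpoint expects exactly this
# ordering is an assumption -- verify against the `hls-gfm` documentation.
BANDS = ["Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2"]

def stack_bands(band_arrays):
    """Stack per-band (T, H, W) arrays into a (C, T, H, W) model input."""
    return np.stack([band_arrays[b] for b in BANDS], axis=0)

# Usage with random stand-in data for a 3-step time series of 224x224 tiles:
dummy = {b: np.random.rand(3, 224, 224).astype("float32") for b in BANDS}
x = stack_bands(dummy)  # shape (6, 3, 224, 224); add a batch dim for the model
```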