tags:
- Pytorch
- Geospatial
- Temporal ViT
---

This repository includes the foundation model architecture of Prithvi, a first-of-its-kind temporal Vision Transformer pre-trained by the IBM and NASA team on continental US Harmonized Landsat Sentinel-2 (HLS) data. The model code lives in the `hls-gfm` folder, alongside instructions for obtaining the pre-trained weights through Hugging Face.

This repo also contains a practical example of fine-tuning Prithvi for flood detection and fire-scar detection as specific downstream applications. See the `fine-tuning-example` folder for more details.

### Model and Input

The model expects remote sensing data in video format, i.e. a tensor of shape (B, C, T, H, W). Note that the temporal dimension is central here and absent from most other work on remote sensing modeling; being able to handle a time series of remote sensing images is very helpful for a variety of downstream tasks. The model can also handle static images, which can simply be fed in with T=1.
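
As a rough sketch of the expected input layout (the shapes below are hypothetical examples; the actual model entry point is defined in the `hls-gfm` folder):

```python
import torch

# Hypothetical shapes for illustration: batch of 2, the 6 HLS bands used in
# pre-training, 3 timesteps, and 224x224 tiles.
B, C, T, H, W = 2, 6, 3, 224, 224
video_input = torch.randn(B, C, T, H, W)   # a short time series per sample

# A static image is just a time series of length one:
static_input = torch.randn(B, C, 1, H, W)  # T=1
```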
### Code
The model follows the [original MAE repo](https://github.com/facebookresearch/mae), with the following modifications, among others (a sketch of the 3D patch embedding appears after the list):

1. replace the 2D patch embedding with a 3D patch embedding
2. replace the 2D positional embedding with a 3D positional embedding
3. replace 2D patchify and unpatchify with their 3D counterparts
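
A minimal sketch of what a 3D patch embedding looks like. This is an illustration only, assuming a `Conv3d`-based tokenizer as in common video-MAE implementations; the class name and defaults here are hypothetical, not the exact `hls-gfm` code:

```python
import torch
import torch.nn as nn

class PatchEmbed3D(nn.Module):
    """Turn a (B, C, T, H, W) video into a sequence of patch tokens."""
    def __init__(self, in_chans=6, embed_dim=768, t_patch=1, patch_size=16):
        super().__init__()
        # A 3D convolution with stride == kernel size tokenizes
        # non-overlapping (t_patch x patch_size x patch_size) cubes.
        self.proj = nn.Conv3d(in_chans, embed_dim,
                              kernel_size=(t_patch, patch_size, patch_size),
                              stride=(t_patch, patch_size, patch_size))

    def forward(self, x):                    # x: (B, C, T, H, W)
        x = self.proj(x)                     # (B, D, T', H', W')
        return x.flatten(2).transpose(1, 2)  # (B, T'*H'*W', D)

tokens = PatchEmbed3D()(torch.randn(2, 6, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 588, 768]); 3 * 14 * 14 = 588 patches
```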
### Pre-training

The model was pre-trained with Harmonized Landsat Sentinel-2 (HLS) data from NASA using the following bands:

* Blue
* Green
* Red
* Narrow NIR
* SWIR 1
* SWIR 2
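
When assembling model inputs from HLS tiles, the bands can be stacked along the channel axis in the order listed above. The helper below is a hypothetical illustration; the function name and the exact ordering expected by the checkpoint are assumptions, so check the `hls-gfm` folder before relying on it:

```python
import numpy as np

# Band order as listed above; whether the checkpoint expects exactly this
# ordering is an assumption -- verify against the `hls-gfm` documentation.
BANDS = ["Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2"]

def stack_bands(band_arrays):
    """Stack per-band (T, H, W) arrays into a (C, T, H, W) model input."""
    return np.stack([band_arrays[b] for b in BANDS], axis=0)

# Usage with random stand-in data for a 3-step time series of 224x224 tiles:
dummy = {b: np.random.rand(3, 224, 224).astype("float32") for b in BANDS}
x = stack_bands(dummy)  # shape (6, 3, 224, 224); add a batch dim for the model
```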