cholmes committed on
Commit
03769da
1 Parent(s): 0ca3401

added potential questions / status

  1. questions.md +135 -0
questions.md ADDED
@@ -0,0 +1,135 @@
+ # Core questions (no hcat)
+
+ * What are the average field sizes of each country in the Baltics?
+ * How many fields are there in Latvia?
+ * What's the total area devoted to agriculture in Latvia?
+ * Is the average field size larger in Lithuania or Latvia?
+ * How many fields are there that are under 1 hectare?
+ * Show a map with the 10 largest fields
+ * What percent of fields are under 2 hectares?
+ * Show a map with the largest field in Estonia
+ * Show a map with the ten largest fields
+ * Which country has the most area covered by fields?
+ * What is the average field size of the largest 10 percent of fields?
+ * What percent of fields are over 20 hectares?
+ * How big on average are the largest 20% of fields?
+ * Can you print me a table that calculates deciles of field area?
+ * Can you print me a table that shows the average field area by decile?
+
+
+
+
+
+ ## More coding / prompting needed
+
+ ### Maps with more
+ * Show a map with the largest fields
+ *Seems to have a limit of 100, but nothing shows up.*
+
+ ### Percent of total area
+ * What percent of Latvia is used for agriculture?
+ (Just need to put in the total area of Latvia, etc. somewhere.)
+
+
+ ### Quantiles / deciles
+ * Can you make a table with quantiles / deciles of the field sizes?
+ ```
+ SELECT NTILE(10) OVER (ORDER BY area) AS decile, MIN(area) AS min_size, MAX(area) AS max_size, AVG(area) AS avg_size FROM crops GROUP BY decile ORDER BY decile;
+ ```
+ *GROUP BY clause cannot contain window functions!*
+ - Need to teach the right DuckDB calculation for this.
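A minimal sketch of one way around the error above, assuming the same `crops` table and `area` column as the failing query: compute the `NTILE(10)` bucket in a CTE first, then aggregate over that result, so the window function never appears in the `GROUP BY`. The CTE name `ranked` and the `num_fields` column are illustrative additions, not part of the original notes.

```sql
-- Step 1: assign each field to a decile with a window function.
WITH ranked AS (
    SELECT area,
           NTILE(10) OVER (ORDER BY area) AS decile
    FROM crops
)
-- Step 2: aggregate per decile; `decile` is now an ordinary column.
SELECT decile,
       COUNT(*)  AS num_fields,
       MIN(area) AS min_size,
       MAX(area) AS max_size,
       AVG(area) AS avg_size
FROM ranked
GROUP BY decile
ORDER BY decile;
```

Because `NTILE(10)` splits the rows into ten roughly equal groups, the `avg_size` column also covers the "average field area by decile" question.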
+
+ ### Graphs / charts
+ * Show a chart of field size by decile/quantile
+ * Show a chart of field size by decile, with the most common crop for that decile (hcat)
+
+ ### Admin 2 level questions
+ - Need to pre-process admin 2 names for each row.
+ * How many fields are there in each county of Estonia?
+ * What state/county has the highest percent of its land as agriculture?
+
+
+ # hcat / crop questions
+
+ * Show a map with the ten largest sugar beet fields
+ * What are the top ten crops by area for Lithuania?
+ * What are the top ten crops by number of fields for Lithuania?
+ * What are the top ten crops that have a field size over 10 hectares in the Baltics?
+ - Sometimes gets this. TODO: teach manual sum of rows for 'number of fields', 'field count', and 'what are the most common crops'.
+
+ * What is the percent of wheat in the Baltics?
+ * What percent of Latvia's agricultural area is corn?
+ * What is the average field size of corn in Latvia?
+ * What crop has the smallest average field size in Latvia?
+ * What are the ten crops with the smallest average field sizes in Latvia?
+ * What are the ten crops with the smallest average field sizes (with at least 20 fields) in Latvia?
+ * What percent of Latvia is strawberries?
+ * What are the ten crops with the largest average field sizes (with at least 20 fields) in Latvia?
+ * How many fields plant vetches in Latvia?
+ * How many fields of corn are there in each of the Baltic states?
+ * What's the total area of corn in each of the Baltic states?
+ * What percent of Lithuania is corn?
+ * What is the average field size for wheat in the Baltics?
+ * What is the most common crop on fields over 5 hectares?
+ * What is the most common crop on fields over 10 hectares in Estonia?
+ * What percent of sugar beet fields are over 10 hectares?
+ - 45.74898785425101
+ * What are the ten most common crops by number of fields?
+ * What are the top 5 flowers in the Baltics?
+ * What are the top 5 legumes by field count in the Baltics?
+ * What are the top 5 legumes in the Baltics?
+ * What are the average field sizes of peas in the Baltics?
+ * What are the average field sizes of beans in the Baltics?
+ * What percent of Estonia is not fallow or pasture?
+ * What percent of Latvia is fallow?
+ * What are the sizes of strawberry fields by quantile in Latvia?
+ * What are the sizes of wheat fields by quantile in Latvia?
+
+
+
+
+
+
+ ## More coding / prompting needed
+
+
+ - *this is using the percent of the country, not the percent of the fields*
+ * What is the percent of wheat in each country?
+ - `SELECT collection, SUM(area) / 4346727 * 100 AS percent_wheat FROM crops WHERE crop_type IN ('common_soft_wheat', 'durum_hard_wheat') GROUP BY collection;` - should use percent of the country, not the total area. Also, this doesn't return any results.
+ * What are the average field sizes of the top ten crops by area?
+ - Tried to teach it not to use window functions like row_number(), but it seems we need to explicitly train it for this type of query, like we did for quantiles.
+ * Which country has the largest area of arable crops (crop code starts with 3301)?
+ * Which country has the largest area of grassland (crop code starts with 3302)?
+ * Which country has the largest area of permanent perennial crops (crop code starts with 3303)?
+ * What is the most unique crop type for each country?
+ - Returned the values with the crop_type for each one.
+ * What percent of sugar in the Baltics is sugar beet?
+
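The wheat-percentage and top-ten-crops questions above can likely both be answered with plain aggregates, without window functions. The queries below are a sketch, assuming the `crops` table and the `collection`, `crop_type`, and `area` columns from the query above. They also assume that "percent of wheat" means wheat's share of each country's own total field area rather than a share of the hard-coded `4346727`. They do not address why the original query returned no rows, since the notes do not say.

```sql
-- Wheat as a percent of each collection's own total field area.
-- Conditional aggregation keeps non-wheat rows in the denominator.
SELECT collection,
       SUM(CASE WHEN crop_type IN ('common_soft_wheat', 'durum_hard_wheat')
                THEN area ELSE 0 END) * 100.0 / SUM(area) AS percent_wheat
FROM crops
GROUP BY collection;

-- Average field size of the ten crops with the largest total area.
-- ORDER BY + LIMIT replaces the row_number() approach.
SELECT crop_type,
       SUM(area) AS total_area,
       AVG(area) AS avg_field_size
FROM crops
GROUP BY crop_type
ORDER BY total_area DESC
LIMIT 10;
```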
+ # Ideas needing more coding
+
+ - show a chart or a graph
+ - Natural language processing of responses, particularly when there's only one result.
+ - decide which of map, table, graph, or answer to use
+ - give more common names in output, i.e. clean up some of the weird quirks of EuroCrops.
+ - multi-step analysis, like get complex results from each country and then compare / analyze
+ - pre-process country level stats
+   - total area, total perimeter
+   - by crop stats - total area, total percent of fields, total percent of overall land
+ - add admin 2 (state) level attribute to each field
+   - update the table when starting up? Save the new table?
+   - do admin 2 level stats like country level ones
+ - geospatial queries (should wait for DuckDB 1.1 support)
+   - like bounding box / polygon
+ - joins with environmental data, etc.
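As a rough illustration of the "pre-process country level stats" idea, the statements below build two summary tables in DuckDB. The table names `country_stats` and `crop_stats` and all output column names are hypothetical; only `crops`, `collection`, `crop_type`, and `area` come from the queries in these notes. Total perimeter and percent of overall land are left out because the notes show neither a perimeter column nor a source for each country's land area.

```sql
-- Per-country totals (hypothetical table name).
CREATE OR REPLACE TABLE country_stats AS
SELECT collection,
       COUNT(*)  AS num_fields,
       SUM(area) AS total_area,
       AVG(area) AS avg_field_size
FROM crops
GROUP BY collection;

-- Per-crop totals within each country (hypothetical table name).
-- The window functions run after GROUP BY, so they sum the per-crop
-- aggregates across all crops of the same collection.
CREATE OR REPLACE TABLE crop_stats AS
SELECT collection,
       crop_type,
       COUNT(*)  AS num_fields,
       SUM(area) AS total_area,
       COUNT(*)  * 100.0 / SUM(COUNT(*))  OVER (PARTITION BY collection) AS percent_of_fields,
       SUM(area) * 100.0 / SUM(SUM(area)) OVER (PARTITION BY collection) AS percent_of_field_area
FROM crops
GROUP BY collection, crop_type;
```

With tables like these in place, questions such as "which country has the most area covered by fields?" become simple lookups instead of full scans of `crops`.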
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+