Publishing data on the World Air Quality Index project is free for everyone. It is also simple and straightforward: all you need to do to add data to the World Air Quality Index map is provide a feed URL with real-time information about the monitoring stations (name and location), the pollutants being measured, and the real-time readings (together with their units, for instance milligrams or ppb).
Qualification Criteria
Note that the data published on the World Air Quality Index map is only official data provided by each country's respective Environmental Protection Agency (see the full EPA list). This official data is measured using professional BAM and TEOM-like air quality monitoring stations.
Those stations are not trivial to operate, and require constant maintenance and calibration by the EPA's professional field-engineering teams. Therefore, except under specific conditions (e.g. if there is no coverage in a country or a given region or city), the World Air Quality Index project does not accept any form of data generated by non-professional air quality monitors.
If your country or city does not have official monitoring, it is acceptable to report data from more affordable instant air particle counters (e.g. PMS, SDS, ...). While we recommend using the semi-professional GAIA air quality monitoring stations for this purpose, we also accept data from other stations. Nevertheless, such particle counter-based stations need to report an enhanced data feed with additional quality controls (see the quality control section).
Feed format
You can also check, for reference, the official feeds for Singapore, Peru or the Netherlands, and notice that they are all different.
Data ingestion
The World Air Quality Index system takes care of regularly checking the data from the feed. Each time an update is available, it is processed, converted to US EPA scale AQI values, and published on the World Air Quality Index website within minutes.
Also, although only PM2.5, PM10, Ozone, NO2, SO2 and CO air quality data is published, the system does collect more pollutants for forecasting purposes: Benzene, Toluene, Ethylbenzene, NOx, THC, NMHC, PM1, Formaldehyde, Mercury, Ammonia, Methane, Hydrogen sulfide, Nitrous acid, Phenol, Naphthalene, paraxylene (p-Xylene), metaxylene (m-Xylene), etc.
It is also possible to publish meteorological data: Temperature, Atmospheric Pressure, Humidity, Precipitation, Wind Speed, Wind Direction, Solar Radiation and UVI. If it is not provided, we will use other relevant meteorological information sources.
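To illustrate what "converted to US EPA scale AQI values" means, here is a minimal sketch of the usual piecewise-linear interpolation between breakpoints, shown for PM2.5 only. The function name `pm25ToAqi` is hypothetical, and the breakpoint table is an assumption (the US EPA 24-hour PM2.5 table as published before its 2024 revision); the table and averaging rules actually used by the project's back-end may differ.

```cpp
#include <cmath>

// One row of an AQI breakpoint table: concentrations [cLo, cHi] map
// linearly onto index values [iLo, iHi].
struct Breakpoint {
    double cLo, cHi;
    int iLo, iHi;
};

// ASSUMPTION: US EPA 24-hour PM2.5 breakpoints (ug/m3), pre-2024 revision.
// The table used by the World Air Quality Index back-end may differ.
static const Breakpoint kPm25Breakpoints[] = {
    {0.0, 12.0, 0, 50},
    {12.1, 35.4, 51, 100},
    {35.5, 55.4, 101, 150},
    {55.5, 150.4, 151, 200},
    {150.5, 250.4, 201, 300},
    {250.5, 350.4, 301, 400},
    {350.5, 500.4, 401, 500},
};

// Hypothetical helper: returns the AQI for a PM2.5 concentration in ug/m3,
// or -1 if the value is negative or above the last breakpoint.
int pm25ToAqi(double c) {
    if (c < 0.0) return -1;
    for (const Breakpoint& bp : kPm25Breakpoints) {
        // A value in the small gap between two rows is assigned to the
        // upper row.
        if (c <= bp.cHi) {
            double aqi = (bp.iHi - bp.iLo) / (bp.cHi - bp.cLo) * (c - bp.cLo) + bp.iLo;
            return static_cast<int>(std::lround(aqi));
        }
    }
    return -1;
}
```

The same interpolation applies to the other published pollutants, each with its own breakpoint table and averaging period.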
Feed Example (CSV format)
#ID: ID_BEI_DC
#City: Beijing
#Station: Dongcheng
#Name: 东城东四
#Latitude: 39.929
#Longitude: 116.417
#Timezone: +0800
Date,PM10,PM25,CO,Ozone,Sulphur Dioxide,Nitrogen Dioxide,AmbientTemperature,RelativeHumidity,WindDirection,WindSpeed,Pressure,RainGauge
Unit,ug/m3,ug/m3,ppm,µg/m3,µg/m3,µg/m3,°C,%,°,m/s,hPa,mm
10/29/2016 13:00,16,3,,58,10,3,32,66,200,3,1001,0
10/29/2016 14:00,19,8,,57,9,4,32,64,197,2,1001,0
10/29/2016 15:00,15,9,,52,47,17,30,72,190,2,1001,0
10/29/2016 16:00,31,19,,52,34,17,30,75,191,2,1001,0
10/29/2016 17:00,31,17,,49,49,19,29,75,194,1,1002,0
10/29/2016 18:00,37,18,,45,55,25,29,73,183,1,1003,0
10/29/2016 19:00,24,13,,40,21,19,29,80,65,1,1004,0
10/29/2016 20:00,39,22,,44,4,16,28,85,7,1,1005,0
10/29/2016 21:00,24,16,,43,3,7,28,85,10,1,1005,0
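If you want to check your own CSV feed before submitting it, the layout above can be read with a few lines of code. The following is only a sketch: the `StationFeed` structure and the `parseFeed` and `splitCsv` functions are hypothetical names, and it assumes the layout shown in the example (metadata lines starting with `#`, then a header row, then a unit row, then one row per reading, with no quoted fields).

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory representation of one station feed.
struct StationFeed {
    std::map<std::string, std::string> metadata;  // e.g. "City" -> "Beijing"
    std::vector<std::string> columns;             // header row, starting with "Date"
    std::vector<std::string> units;               // unit row, aligned with columns
    std::vector<std::vector<std::string>> rows;   // one entry per reading line
};

// Splits a line on commas, keeping empty fields (a missing reading stays "").
static std::vector<std::string> splitCsv(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    for (char ch : line) {
        if (ch == ',') {
            fields.push_back(field);
            field.clear();
        } else if (ch != '\r') {
            field += ch;
        }
    }
    fields.push_back(field);
    return fields;
}

StationFeed parseFeed(const std::string& text) {
    StationFeed feed;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        if (line[0] == '#') {
            // Metadata line of the form "#Key: value"
            std::size_t colon = line.find(':');
            if (colon == std::string::npos) continue;
            std::string key = line.substr(1, colon - 1);
            std::string value = line.substr(colon + 1);
            std::size_t start = value.find_first_not_of(' ');
            feed.metadata[key] = (start == std::string::npos) ? "" : value.substr(start);
        } else if (feed.columns.empty()) {
            feed.columns = splitCsv(line);   // first non-metadata line: column names
        } else if (feed.units.empty()) {
            feed.units = splitCsv(line);     // second line: units
        } else {
            feed.rows.push_back(splitCsv(line));
        }
    }
    return feed;
}
```

Empty fields, such as the CO column in the example, are kept as empty strings so that every row stays aligned with the header and unit rows.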
Feed Example (HTML format)
Station ID | City or County Name | Station Name | Local name (optional) | Latitude/Longitude | Timezone (optional) |
---|---|---|---|---|---|
ID_BEI_DC | Beijing | Dongcheng | 东城东四 | 39.929/116.417 | +0800 |
ID_BEI_WP | Beijing | West Park | 西城官园 | 39.929/116.339 | +0800 |
ID_BEI_OP | Beijing | Olympic Park | 朝阳奥体中心 | 39.982/116.397 | +0800 |
... | ... | ... | ... | ... | ... |
- The "Station ID" is the unique identifier for each station. It can just be a number (e.g. ID8373), or the concatenation of the station city and station name (e.g. "Beijing/Dongcheng").
- By default, the station will be available via the URL /city/country-name/city-name/station-name.
- The "Station Name" must use Latin characters, so the optional "Local Name" can be provided to localize the webpage.
Real-time pollutant list:
Station ID | Pollutant | Unit | Update time | Value | Averaging |
---|---|---|---|---|---|
ID_BEI_DC | PM10 | mg/m3 | 2019/06/05 17:00:00 | 27.8 | 1 hour |
ID_BEI_DC | PM25 | mg/m3 | 2019/06/05 17:00:00 | 10.8 | 1 hour |
ID_BEI_DC | Ozone | mg/m3 | 2019/06/05 17:00:00 | 15.2 | 1 hour |
ID_BEI_DC | Ozone | mg/m3 | 2019/06/05 17:00:00 | 18.2 | 8 hours |
ID_BEI_DC | Temperature | Celsius | 2019/06/05 17:00:00 | 22.3 | 1 hour |
ID_BEI_WP | PM10 | mg/m3 | 2019/06/05 17:00:00 | 27.8 | 1 hour |
ID_BEI_WP | PM25 | mg/m3 | 2019/06/05 17:00:00 | 10.8 | 1 hour |
ID_BEI_WP | SO2 | ppb | 2019/06/05 17:00:00 | 15.2 | 1 hour |
ID_BEI_WP | Humidity | % | 2019/06/05 17:00:00 | 88 | 1 hour |
... | ... | ... | ... | ... | ... |
- The "Averaging" column is used to specify the duration over which the value is averaged. The most common averaging is 1 hour. It is also the preferred one, as our back-end system will automatically do the 8-hour averaging computation for Ozone and Carbon Monoxide.
- If the readings are provided more frequently than every hour (for instance every 30 minutes or 10 minutes), you can either provide the raw readings for the given period, or just the hourly average: our back-end system will process the data in either case, even between hours.
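For reference, the 8-hour averaging mentioned above can be sketched as a simple rolling mean over the hourly values. The helper name `rollingAverage` is hypothetical, and the sketch assumes a gap-free series of hourly readings; how the back-end actually handles missing hours or partial windows is not specified here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for each hour i, averages the readings of hours
// i-window+1 .. i. Hours that do not yet have a full window yield -1.
// ASSUMPTION: the input has one reading per hour, with no missing hours.
std::vector<double> rollingAverage(const std::vector<double>& hourly,
                                   std::size_t window = 8) {
    std::vector<double> out(hourly.size(), -1.0);
    if (window == 0) return out;
    double sum = 0.0;
    for (std::size_t i = 0; i < hourly.size(); i++) {
        sum += hourly[i];                            // newest reading enters the window
        if (i >= window) sum -= hourly[i - window];  // oldest reading leaves it
        if (i + 1 >= window) out[i] = sum / window;
    }
    return out;
}
```

Since this computation is done on the server side, a feed only needs to carry the 1-hour values.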
Feed Example (JSON format)
Quality Control for particle counter sensors
Averaging alone is however not good enough, especially for failing sensors (or sensors close to end-of-life). Therefore, for such sensors, it is required to provide additional metrics, such as the median, min, max, and standard deviation. See for example the readings object in the JSON data feed. You can use the following Arduino-compatible code to collect those metrics:
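#include <cmath>
#include <cstdio>
#include <string>

#define OUTPUT_BUFFER_SIZE 120
#define MAXACCVALUES 120

// Sliding-window accumulator: keeps the last MAXACCVALUES readings and
// reports min, max, median, average and standard deviation as JSON.
class Accumulator
{
    char buffer[OUTPUT_BUFFER_SIZE];
    int vals[MAXACCVALUES];
    int count = 0;

public:
    Accumulator()
    {
        reset();
    }

    // Discards all stored readings.
    void reset()
    {
        count = 0;
        for (int i = 0; i < MAXACCVALUES; i++)
        {
            vals[i] = 0;
        }
    }

    bool hasData()
    {
        return count != 0;
    }

    // Returns the metrics as a JSON object, or "{}" if there is no data.
    std::string output()
    {
        if (!hasData()) return std::string("{}");
        snprintf(buffer, OUTPUT_BUFFER_SIZE, "{\"min\":%d,\"max\":%d,\"median\":%d,\"average\":%.1f,\"stddev\":%.1f,\"count\":%d}",
                 vmin(), vmax(), median(), avg(), stddev(), count);
        return std::string(buffer);
    }

    // Appends a reading; when the buffer is full, the oldest one is dropped.
    void add(int val)
    {
        if (count == MAXACCVALUES) {
            for (int i = 0; i < MAXACCVALUES - 1; i++) {
                vals[i] = vals[i + 1];
            }
            count--;
        }
        vals[count++] = val;
    }

    // Population standard deviation of the stored readings.
    float stddev()
    {
        if (!hasData()) return -1;
        float u = avg();
        float t = 0;
        for (int i = 0; i < count; i++) {
            t += (vals[i] - u) * (vals[i] - u);
        }
        return std::sqrt(t / count);
    }

    int median()
    {
        if (!hasData()) return -1;
        // Sort a copy so that vals keeps the insertion order add() relies on.
        int sorted[MAXACCVALUES];
        for (int i = 0; i < count; i++) {
            sorted[i] = vals[i];
        }
        for (int i = 0; i < count; i++) {
            for (int j = i + 1; j < count; j++) {
                if (sorted[i] > sorted[j]) {
                    int t = sorted[j];
                    sorted[j] = sorted[i];
                    sorted[i] = t;
                }
            }
        }
        return sorted[count / 2];
    }

    float avg()
    {
        if (!hasData()) return -1;
        float t = 0;
        for (int i = 0; i < count; i++) {
            t += vals[i];
        }
        return t / count;
    }

    int vmin()
    {
        if (!hasData()) return -1;
        int t = vals[0];
        for (int i = 0; i < count; i++) {
            if (t > vals[i]) {
                t = vals[i];
            }
        }
        return t;
    }

    int vmax()
    {
        if (!hasData()) return -1;
        int t = vals[0];
        for (int i = 0; i < count; i++) {
            if (vals[i] > t) {
                t = vals[i];
            }
        }
        return t;
    }
};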