Sunday, February 21, 2016

The magic of SQL queries on my flight data collection....

If you have read my other post you know that I am logging all received ADS-B "squirts" into a mySQL database. On average I collect around 50.000 complete entries per day (with callsign, altitude, speed, position, track, date/time).

I can retrieve all data with a simple "select * from flights;"

Table Structure & Data
After one week of data collection I have collected 400.000 entries in my database.
Now let's explore this data and learn some SQL query language at the same time.

How many Aircraft and Flights each day?

Query:
SELECT date, count(distinct hexident), count(distinct sign) FROM flights GROUP BY date;

Result:
Unique/Distinct Aircraft and Flights
You can clearly see that Aircraft are serving multiple flights each day - e.g. arriving with Lufthansa 001 and departing with Lufthansa 002.

Which Aircraft are racking up most flights?

Query:
Select distinct hexident,count(distinct sign) from flights
group by hexident order by count(distinct sign);
Result:
Number of flights per Aircraft HexID

Who is running these busy planes? 

Apparently it's German Wings (GWI):

Query:
select distinct sign, hexident from flights
where hexident like '3C5EF%' order by hexident;

German Wings keeps its planes busy...

The fastest/slowest/highest/lowest each day?

Let's focus on the the highes/lowest/fastest/slowest flights with
Query:
Select date, count(*), count(distinct hexident), count(distinct sign),
max(alti), min(alti), max(speed), min(speed) from flights group by date;
Result:
Highest/Lowest/Fastest/Slowest

And I can look for the specifics:

  • Flights below 0 feet altitude? Just a few blips from nearby Airbus plant Finkenwerder...pilots are either diving in the river or have not set proper altitude...
  • Flights above 44.000 feet altitude? AOJ87D a Swiss Jet? , FPG111 - a Dassault Falcon? , MMD6424 - another Dassault Falcon? It must be nice to look down from your own private jet.
  • Flights slower than 100 (knots?) - Who is CHX26? Looks like I tracked a helicopter?
  • Flights faster than 600 (knots?) - Looks like aircraft passing by (Air India, Air Canada...)

The furthest away?

Let's look for the flights that are furthest away... with this query:
select date, count(*), min(lat), max(lat), min(lon), max(lon) from flights group by date;

Furthest Flights
I could do some math (difference in Lat/Long to my position), but I am trying to keep it simple.


What interesting SQL queries can you think of?

If you have some ideas let me know...you can also have access to the mySQL data and experiment yourself...






















Sunday, February 14, 2016

ADS-B Data Visualization (separate by Speed/Altitude)

Just playing around with my ADS-B data that is collected in a mySQL database.
Competing a little with someone in Hannover doing the same... http://blog.wenzlaff.de/

Unfortunately cartodb.com does not allow direct mySQL data feeding, so I have to export to CSV first and then upload. Oh well....

Find below my latest visualizations with cartodb:

All Positions (unfiltered)

This picture nicely shows the reception area of my indoor antenna.
The spike to the Left/West is the windows of my office...
Unfiltered data...a bit busy...

High Altitude = Just passing over....

High Altitude Flights Only (> 12.000 feet)

Low Altitude = Departing or Landing

Low Altitude Flights Only (<= 10.000 feet)

Groundspeed (low speed=green=landing/departing)

I like this visualization....the ground speed clearly shows the flights departing/landing...while the overflights are at higher speed.
Groundspeed (slow=green, fast=red)

More?

More ideas...to experiment:
  • create KML files for Google Earth with flight-tracks from connected positions
  • create 3D rendering in Blender with the data
  • daily status reports (highest/lowest/fastest/slowest/furthest)

Tuesday, February 9, 2016

ADS-B Aircraft Data Collection & Aggregation

Understanding and aggregating ADSB message data

After installing DUMP1090 I was looking for ways to collect the position data for later analysis.
You can easily display the dump1090 data feed with netcat: nc –d localhost 30003. And of course you can simply run netcat in the background and write the data to file with  nc -d localhost 30003 >> adsb.csv
raw data from DUMP1090
This “base station format” is well explained here http://www.airnavsystems.com/forum/index.php?topic=2896.0 You will quickly realize this raw data needs to aggregated, as different message types send different data – but never at the same time.

ADS-B data before aggregation

MY SOLUTION? A LITTLE SHELL SCRIPT THAT… 

  • continously pipes the netcat raw data through a simple parser (IFS):
    nc -d $ADSBhost $ADSBport | while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22
    do #loop until a break is thrown
  • lets me read each individual value in each individual message:
    #echo "Field 05 HexIdent         :$f5"
    #echo "Field 07 Date message gen :$f7"
    #echo "Field 08 Time message gen :$f8"
    #echo "Field 11 Callsign         :$f11"
    #echo "Field 12 Altitude         :$f12"
    #echo "Field 13 GroundSpeed      :$f13"
    #echo "Field 14 Track            :$f14"
    #echo "Field 15 Latitude         :$f15"
    #echo "Field 16 Longitude        :$f16"
    #echo "Field 17 Vertical Rate    :$f17"
  • Saves the relevant data into an array
    ident=$((0x${f5})) #convert Aircraft hex ID to an number
    if [ "$f11" != "" ];  then arr_call[ident]="$f11"; fi
    if [ "$f12" != "" ];  then arr_alti[ident]="$f12"; fi
    if [ "$f13" != "" ];  then arr_velo[ident]="$f13"; fi
    if [ "$f14" != "" ];  then arr_trck[ident]="$f14"; fi
    if [ "$f17" != "" ];  then arr_vert[ident]="$f17"; fi
  • And finally write the data only when both position and callsign are known
    #if position and if callsign is broadcast
    if [ "$f15" != "" ]; then   #if f15 Latitude not empty
    if [ "${arr_call[ident]}" != "" ]; then #if callsign is already known
    MapPoint="$counter,$f5,$f7,$f8,$f15,$f16,${arr_call[ident]},${arr_alti[ident]}"
    MapPoint="$punkt,${arr_velo[ident]},${arr_trck[ident]},${arr_vert[ident]}"
    echo "$ MapPoint" >> "$ADSBlog"      #write to log file

The Result? 

Now, instead of the original dump1090 messages like this: 
MSG,7,111,11111,47873D,111111,2016/02/09,21:16:09.550,2016/02/09,21:16:09.533,,36000,,,,,,,,,,0
MSG,5,111,11111,47873D,111111,2016/02/09,21:16:10.983,2016/02/09,21:16:10.974,,35975,,,,,,,0,,0,0

I have a complete data set like this:
MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical
250,47875D,2016/02/09,21:12:35.282,53.53898,8.41868,IBK5411 ,38000,418,225,0
365,47875D,2016/02/09,21:12:53.369,53.51395,8.37762,IBK5411 ,38000,419,225,0

Before and after data aggregation

And the best: instead of 100MB per day my new aggregated log file is only 3 MB/day. Hooray!

The resulting CSV file can now easily be geo-plotted – e.g. with cartodb.com. 
If you filter by altitude, you can nicely see the departing/landing flights without the fly-overs….

Quick visualization using cartodb.com 
Have a look at some of my maps at https://matthiasadsb.cartodb.com/ 

CAVEATS?

Yes, my script is fairly basic. But it works quite well so far.
No input sanitizing  …  (I hope there is no callsign “rm *” or “drop table;”)
Memory management (the array should be cleaned up from time to time, otherwise it accumulates ALL aircraft).
Drops aircraft data without a callsign (I noticed some military aircraft do send position data but not call sign)
Carries over the last callsign one spot – even if the same aircraft now departs with a different flight number.

Here is the Script - Feel free to improve

pi@gemelli-pi03:~ $ cat dump2sql.sh
#!/bin/bash
#Shell script to listen parse and aggregate ADSB data
#Run in foreground to see progress bar
#run  in background with: dump2sql.sh >/dev/null &

Timestamp=$(date +"%Y-%m-%d")   #separate logfile every day
ADSBhost="192.168.0.xxxx"
ADSBport="30003"
ADSBlog="/home/pi/adsb-log-$Timestamp.txt"
#now the mySQL credentials...use certificates in next version
SQLsrv='XXXXXX.YYYYYY.eu-central-1.rds.amazonaws.com'
SQLusr="ZZZZZ"
SQLpwd="*****"
SQLdbs="DDDDDD"
SQLqueries="/home/pi/dump.sql"
counter=0       #for loop control
countmax=200    #script can stop after so many loops

echo "DUMP1090 Aggregator2mySQL by Matthias Gemelli 2016"
echo "listening to server: $ADSBhost writing to log $ADSBlog"
echo "mySQL as $SQLusr to $SQLsrv"
echo "progress bar: . for every message, X for every location, Y for missing cal                                                                                                                lsign"
echo "exit with Ctrl-C or set message limit in countmax"
echo "-------"
echo "MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertica                                                                                                                l" >> "$ADSBlog"

#now declare the arrays
declare -a arr_call
declare -a arr_alti
declare -a arr_sped
declare -a arr_trck
declare -a arr_vert

#loop through the Netcat data
nc -d $ADSBhost $ADSBport | while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10                                                                                                                 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22
do     #loop until a break is thrown

#first update the filename
Timestamp=$(date +"%Y-%m-%d")   #separate logfile every day
ADSBlog="/home/pi/adsb-log-$Timestamp.txt"

#now read the relevant data fields in every ADSB record
#echo "Field 05 HexIdent         :$f2"
#echo "Field 07 Date message gen :$f7"
#echo "Field 08 Time message gen :$f8"
#echo "Field 11 Callsign         :$f11"
#echo "Field 12 Altitude         :$f12"
#echo "Field 13 GroundSpeed      :$f13"
#echo "Field 14 Track            :$f14"
#echo "Field 15 Latitude         :$f15"
#echo "Field 16 Longitude        :$f16"
#echo "Field 17 Vertical Rate    :$f17"

#now save the data into array, using HexIdent as index
ident=$((0x${f5}))
if [ "$f11" != "" ];  then arr_call[ident]="$f11"; fi
if [ "$f12" != "" ];  then arr_alti[ident]="$f12"; fi
if [ "$f13" != "" ];  then arr_velo[ident]="$f13"; fi
if [ "$f14" != "" ];  then arr_trck[ident]="$f14"; fi
if [ "$f17" != "" ];  then arr_vert[ident]="$f17"; fi

#if position and if callsign is broadcast
if [ "$f15" != "" ]; then  #if f15 not empty
if [ "${arr_call[ident]}" != "" ]; then #if callsign is already known

punkt="$counter,$f5,$f7,$f8,$f15,$f16,${arr_call[ident]},${arr_alti[ident]}"
punkt="$punkt,${arr_velo[ident]},${arr_trck[ident]},${arr_vert[ident]}"
echo "$punkt" >> "$ADSBlog"      #write to log file

#now compose SQL statement
QUERY="INSERT INTO esp8266data.flights "
QUERY="$QUERY(msgcount,hexident,date,time,lat,lon,sign,alti,speed,trck,vert) VAL                                                                                                                UES"
QUERY="$QUERY ($counter,\"$f5\",\"$f7\",\"$f8\",\"$f15\",\"$f16\","
QUERY="$QUERY \"${arr_call[ident]}\",${arr_alti[ident]},${arr_velo[ident]},"
QUERY="$QUERY ${arr_trck[ident]},${arr_vert[ident]});"
echo "$QUERY" >$SQLqueries  #write SQL to a file before executing
#echo "$QUERY"
mysql -h $SQLsrv -u $SQLusr -p$SQLpwd $SQLdbs <$SQLqueries

#progress bar on shell - X for position, Y for pos without callsign
printf "X"
else  #if no callsign is known at position
printf "Y $f5"
fi #if callsign is aleady known
else  #what to do if no position is given
printf "."  #progress bar
fi #if f15 not empty


#reset the array if it is midnight (fewer planes)
#if reached max counter then break from loop
#if [ "$counter" -gt "$countmax" ]; then break; fi
((counter++))   #increase counter

done            #netcat listener loop
#attention: variables set within the loop stay in the loop
echo "done"