Tuesday, February 9, 2016

ADS-B Aircraft Data Collection & Aggregation

Understanding and aggregating ADSB message data

After installing dump1090, I was looking for ways to collect the position data for later analysis.
You can easily display the dump1090 data feed with netcat: nc -d localhost 30003. And of course you can simply run netcat in the background and write the data to a file with nc -d localhost 30003 >> adsb.csv
raw data from DUMP1090
This “base station format” is well explained here: http://www.airnavsystems.com/forum/index.php?topic=2896.0 You will quickly realize that this raw data needs to be aggregated, as different message types send different data, but never all at the same time.

ADS-B data before aggregation

MY SOLUTION? A LITTLE SHELL SCRIPT THAT… 

  • continuously pipes the netcat raw data through a simple parser (IFS):
    nc -d $ADSBhost $ADSBport | while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22
    do #loop until a break is thrown
  • lets me read each individual value in each individual message:
    #echo "Field 05 HexIdent         :$f5"
    #echo "Field 07 Date message gen :$f7"
    #echo "Field 08 Time message gen :$f8"
    #echo "Field 11 Callsign         :$f11"
    #echo "Field 12 Altitude         :$f12"
    #echo "Field 13 GroundSpeed      :$f13"
    #echo "Field 14 Track            :$f14"
    #echo "Field 15 Latitude         :$f15"
    #echo "Field 16 Longitude        :$f16"
    #echo "Field 17 Vertical Rate    :$f17"
  • Saves the relevant data into an array
    ident=$((0x${f5})) #convert aircraft hex ID to a number
    if [ "$f11" != "" ];  then arr_call[ident]="$f11"; fi
    if [ "$f12" != "" ];  then arr_alti[ident]="$f12"; fi
    if [ "$f13" != "" ];  then arr_velo[ident]="$f13"; fi
    if [ "$f14" != "" ];  then arr_trck[ident]="$f14"; fi
    if [ "$f17" != "" ];  then arr_vert[ident]="$f17"; fi
  • And finally writes the data only when both position and callsign are known
    #if position and if callsign is broadcast
    if [ "$f15" != "" ]; then   #if f15 Latitude not empty
    if [ "${arr_call[ident]}" != "" ]; then #if callsign is already known
    MapPoint="$counter,$f5,$f7,$f8,$f15,$f16,${arr_call[ident]},${arr_alti[ident]}"
    MapPoint="$MapPoint,${arr_velo[ident]},${arr_trck[ident]},${arr_vert[ident]}"
    echo "$MapPoint" >> "$ADSBlog"      #write to log file
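
The steps above can be tried offline without a running dump1090. The following is a minimal sketch, not the script itself: aggregate_sbs is a hypothetical helper name, the two sample messages are made up for illustration, and it assumes bash 4 or later so that an associative array can be keyed directly by the HexIdent string instead of a converted number.

```bash
#!/bin/bash
# Sketch only: aggregate_sbs is a hypothetical helper, not part of dump1090.
# Reads BaseStation-format lines on stdin and prints one merged record per
# position message once a callsign is known for that aircraft.
# Needs bash 4+ for associative arrays.
aggregate_sbs() {
    local -A call alti      # last known callsign / altitude, keyed by HexIdent
    local f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 rest
    while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 rest; do
        [ -n "$f5" ] || continue                    # skip lines without HexIdent
        if [ -n "$f11" ]; then call[$f5]="$f11"; fi
        if [ -n "$f12" ]; then alti[$f5]="$f12"; fi
        # emit only when this message has a latitude and a callsign is known
        if [ -n "$f15" ] && [ -n "${call[$f5]}" ]; then
            echo "$f5,$f7,$f8,$f15,$f16,${call[$f5]},${alti[$f5]}"
        fi
    done
}

# Two made-up sample messages: a callsign message, then a position message.
printf '%s\n' \
  'MSG,1,111,11111,47875D,111111,2016/02/09,21:12:30.000,2016/02/09,21:12:30.010,IBK5411,,,,,,,,,,,0' \
  'MSG,3,111,11111,47875D,111111,2016/02/09,21:12:35.282,2016/02/09,21:12:35.290,,38000,,,53.53898,8.41868,,,,,,0' \
  | aggregate_sbs
# prints: 47875D,2016/02/09,21:12:35.282,53.53898,8.41868,IBK5411,38000
```

The first message only carries the callsign, so nothing is printed for it; the second carries altitude and position and is merged with the callsign remembered from the first.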

The Result? 

Now, instead of the original dump1090 messages like this: 
MSG,7,111,11111,47873D,111111,2016/02/09,21:16:09.550,2016/02/09,21:16:09.533,,36000,,,,,,,,,,0
MSG,5,111,11111,47873D,111111,2016/02/09,21:16:10.983,2016/02/09,21:16:10.974,,35975,,,,,,,0,,0,0

I have a complete data set like this:
MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical
250,47875D,2016/02/09,21:12:35.282,53.53898,8.41868,IBK5411 ,38000,418,225,0
365,47875D,2016/02/09,21:12:53.369,53.51395,8.37762,IBK5411 ,38000,419,225,0

Before and after data aggregation

And best of all: instead of 100 MB per day, my new aggregated log file is only 3 MB per day. Hooray!

The resulting CSV file can now easily be geo-plotted, e.g. with cartodb.com.
If you filter by altitude, you can nicely see the departing and landing flights without the fly-overs.
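
The altitude filter can also be applied before uploading. Here is a rough sketch using awk, assuming the column order of the aggregated log shown above (Altitude in column 8, in feet). filter_below is a hypothetical helper name, the 10000 ft limit is an arbitrary example value, and the second data row is made up.

```bash
#!/bin/bash
# Sketch only: filter_below is a hypothetical helper.
# Keeps the first line (the CSV header) plus every row whose Altitude
# (column 8) is a plain integer below the limit given as first argument.
filter_below() {
    awk -F',' -v max="$1" 'NR == 1 || ($8 ~ /^-?[0-9]+$/ && $8 + 0 < max + 0)'
}

printf '%s\n' \
  'MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical' \
  '250,47875D,2016/02/09,21:12:35.282,53.53898,8.41868,IBK5411 ,38000,418,225,0' \
  '900,3C6444,2016/02/09,21:20:00.000,53.05000,8.79000,DLH123 ,2500,160,270,-640' \
  | filter_below 10000
# prints the header line and the 2500 ft row only
```

On a real log this would be something like filter_below 10000 < adsb-log.txt > low.csv. Rows with an empty or non-numeric altitude are dropped, which also removes repeated header lines further down the file.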

Quick visualization using cartodb.com 
Have a look at some of my maps at https://matthiasadsb.cartodb.com/ 

CAVEATS?

Yes, my script is fairly basic. But it works quite well so far.
No input sanitizing (I hope there is no callsign “rm *” or “drop table;”).
No memory management (the arrays should be cleaned up from time to time, otherwise they accumulate ALL aircraft).
Drops aircraft data without a callsign (I noticed some military aircraft send position data but no callsign).
Carries the last known callsign over to later positions, even if the same aircraft now departs with a different flight number.
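
For the first caveat, a small check before a value is used could look like the sketch below. The function names are hypothetical and the accepted patterns are my assumptions about well-formed input (six hex digits for the HexIdent, up to eight alphanumeric characters plus optional space padding for the callsign), not something taken from a specification.

```bash
#!/bin/bash
# Hypothetical validators; the accepted patterns are assumptions, not a spec.
is_hexident() {
    local re='^[0-9A-Fa-f]{6}$'            # exactly six hex digits
    [[ $1 =~ $re ]]
}
is_callsign() {
    local re='^[A-Za-z0-9]{1,8} *$'        # 1-8 alphanumerics, optional padding
    [[ $1 =~ $re ]]
}

for value in '47875D' 'rm *'; do
    if is_hexident "$value"; then echo "ok:  $value"; else echo "bad: $value"; fi
done
# prints "ok:  47875D" and then "bad: rm *"
```

Inside the read loop this could be used as is_hexident "$f5" || continue ahead of the hex-to-number conversion, and is_callsign "$f11" before a callsign is stored or written into an SQL statement.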

Here is the Script - Feel free to improve

pi@gemelli-pi03:~ $ cat dump2sql.sh
#!/bin/bash
#Shell script to listen to, parse and aggregate ADS-B data
#Run in foreground to see progress bar
#Run in background with: dump2sql.sh >/dev/null &

Timestamp=$(date +"%Y-%m-%d")   #separate logfile every day
ADSBhost="192.168.0.xxxx"
ADSBport="30003"
ADSBlog="/home/pi/adsb-log-$Timestamp.txt"
#now the mySQL credentials...use certificates in next version
SQLsrv='XXXXXX.YYYYYY.eu-central-1.rds.amazonaws.com'
SQLusr="ZZZZZ"
SQLpwd="*****"
SQLdbs="DDDDDD"
SQLqueries="/home/pi/dump.sql"
counter=0       #for loop control
countmax=200    #script can stop after so many loops

echo "DUMP1090 Aggregator2mySQL by Matthias Gemelli 2016"
echo "listening to server: $ADSBhost writing to log $ADSBlog"
echo "mySQL as $SQLusr to $SQLsrv"
echo "progress bar: . for every message, X for every location, Y for missing callsign"
echo "exit with Ctrl-C or set message limit in countmax"
echo "-------"
echo "MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical" >> "$ADSBlog"

#now declare the arrays
declare -a arr_call
declare -a arr_alti
declare -a arr_velo
declare -a arr_trck
declare -a arr_vert

#loop through the Netcat data
nc -d $ADSBhost $ADSBport | while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22
do     #loop until a break is thrown

#first update the filename
Timestamp=$(date +"%Y-%m-%d")   #separate logfile every day
ADSBlog="/home/pi/adsb-log-$Timestamp.txt"

#now read the relevant data fields in every ADSB record
#echo "Field 05 HexIdent         :$f5"
#echo "Field 07 Date message gen :$f7"
#echo "Field 08 Time message gen :$f8"
#echo "Field 11 Callsign         :$f11"
#echo "Field 12 Altitude         :$f12"
#echo "Field 13 GroundSpeed      :$f13"
#echo "Field 14 Track            :$f14"
#echo "Field 15 Latitude         :$f15"
#echo "Field 16 Longitude        :$f16"
#echo "Field 17 Vertical Rate    :$f17"

#now save the data into array, using HexIdent as index
ident=$((0x${f5}))
if [ "$f11" != "" ];  then arr_call[ident]="$f11"; fi
if [ "$f12" != "" ];  then arr_alti[ident]="$f12"; fi
if [ "$f13" != "" ];  then arr_velo[ident]="$f13"; fi
if [ "$f14" != "" ];  then arr_trck[ident]="$f14"; fi
if [ "$f17" != "" ];  then arr_vert[ident]="$f17"; fi

#if position and if callsign is broadcast
if [ "$f15" != "" ]; then  #if f15 not empty
if [ "${arr_call[ident]}" != "" ]; then #if callsign is already known

punkt="$counter,$f5,$f7,$f8,$f15,$f16,${arr_call[ident]},${arr_alti[ident]}"
punkt="$punkt,${arr_velo[ident]},${arr_trck[ident]},${arr_vert[ident]}"
echo "$punkt" >> "$ADSBlog"      #write to log file

#now compose SQL statement
QUERY="INSERT INTO esp8266data.flights "
QUERY="$QUERY(msgcount,hexident,date,time,lat,lon,sign,alti,speed,trck,vert) VALUES"
QUERY="$QUERY ($counter,\"$f5\",\"$f7\",\"$f8\",\"$f15\",\"$f16\","
QUERY="$QUERY \"${arr_call[ident]}\",${arr_alti[ident]},${arr_velo[ident]},"
QUERY="$QUERY ${arr_trck[ident]},${arr_vert[ident]});"
echo "$QUERY" >$SQLqueries  #write SQL to a file before executing
#echo "$QUERY"
mysql -h $SQLsrv -u $SQLusr -p$SQLpwd $SQLdbs <$SQLqueries

#progress bar on shell - X for position, Y for pos without callsign
printf "X"
else  #if no callsign is known at position
printf "Y %s" "$f5"
fi #if callsign is already known
else  #what to do if no position is given
printf "."  #progress bar
fi #if f15 not empty


#reset the array if it is midnight (fewer planes)
#if reached max counter then break from loop
#if [ "$counter" -gt "$countmax" ]; then break; fi
((counter++))   #increase counter

done            #netcat listener loop
#attention: variables set within the loop stay in the loop
echo "done"
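
A note on the "variables set within the loop stay in the loop" comment: in bash, each part of a pipeline normally runs in a subshell, so a counter incremented inside nc ... | while ... is gone after done. One possible workaround, sketched here with printf standing in for the nc command, is to feed the loop through process substitution so that it runs in the current shell.

```bash
#!/bin/bash
# Sketch: keep loop variables alive by avoiding the pipeline subshell.
# printf stands in for the data source; in the script above it would be
# the nc command instead.
counter=0
while IFS= read -r line; do
    counter=$((counter + 1))
done < <(printf '%s\n' a b c)   # process substitution instead of a pipe

echo "messages seen: $counter"
# prints: messages seen: 3
```

This needs bash (not plain sh), which the script already requires because of its arrays.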

8 comments:

  1. Which nc do you use? Mine does not support the -d flag. What does it mean?

    1. I'm using the standard netcat in the Raspbian Jessie Lite image.
      Nothing exotic...

    2. Debian 7.9 (Wheezy) provides two different nc packages: netcat-traditional, the one without the -d switch, and netcat-openbsd, the one with the -d switch.

  2. Hello,
    Not a bad idea, and I gave it a try, but on my system the script simply quits after a certain time. I have commented out the break. It always happens roughly when the counter is at 14,000-15,000.
    Any idea, perhaps? About 4 positions per second are coming in on my end.

  3. Thanks. Works fine. See

    http://blog.wenzlaff.de/?p=6706

    and

    http://blog.wenzlaff.de/?p=6694

  4. It looks like the script cannot cope with the volume of records.
    On my system the counter gets to about 4000, which corresponds to a runtime of 30-35 seconds. Then nc is restarted and the CSV file has time gaps of the same duration. While it is capturing, there are about 45 positions per second in the CSV file. After 21 hours the CSV file contains 571835 records and is 45 MB in size. dump1090 shows me up to 900 messages/sec and up to 120 aircraft (more than 100 of them with positions) at the same time.
    The whole thing runs on an HP N54 server and connects to the Raspberry Pi via port 30003. The load on the server also rises to as much as 1.3; when "idle" I otherwise have 0.2 - 0.3. If I just let nc connect and write to the console, I get no dropouts.

    1. Hi Shortie - no, the volume of records is not the problem. On my side the script has now been running for a week on a normal Raspi2 and has reached a message counter of 7.9 million.

      The problem is that the netcat loop often aborts. That is why the new version has an outer loop around it that logs the interruptions. With that it works again.

      Have a look at http://10pm-blog.blogspot.de/2016/03/new-improved-ads-b-aircraft-data.html

      Apart from that, I like to debug with lots of ECHOs... I left them in the code :-)

      Matthias

  5. I was already using the newer script for that. As I said: if I let nc write to the console, nothing aborts. When I have a quiet minute I will probably once again pursue the approach of patching a mysql client directly into dump1090.
