Sunday, March 6, 2016

NEW & IMPROVED: ADS-B Data Collection

My first scripts worked well - see post at http://10pm-blog.blogspot.de/2016/02/ads-b-aircraft-data-collection.html

But...after installing Dump1090 Mutability the script often stopped working.
Also noticed some SQL errors. Time for some debugging and better error handling.

Here are my revised DUMP2SQL scripts

Topology - Data Flow

Main improvements:

  • Original script often stopped when Netcat did not receive data - fixed with another simple loop around. Added some logging to see how often this happens....
  • Now include data without a callsign - and immediately tracked a sail plane in the area...  :-)
  • Fixed SQL errors from empty fields - now passing "NULL" as value
  • Segregated the data collection from mySQL upload - to avoid missing ADS- messages when busy with. This way you can trigger the mySQL insert every few minutes by Crontab and not for every single message.
  • Added mySQL error handling and logging. SQL files are deleted when insert was successful (CSV data remains untouched).
  • Changed the SQL table layout (not a good idea to use reserved names for columns such as "index")

All the files-....

Results? Working absolutely fine!

No more missed ADS-B messages (averaging now 120.000 per day instead of 80.000).
No more SQL errors.
No more missed sail planes or helicopters (that fly without a callsign).

Collected ADS-B data in both CSV and SQL format






dump2sql.sh  (main data aggregation script)

Main purpose of this script is to collect all ADS-B messages, and aggregate them to complete missing fields (e.g. to carry-over the call-sign) and to save data only when location is known.

Launch this script at startup with crontab (sudo crontab -e) and insert a line:
@reboot /home/pi03/dump2sql.sh >/dev/null &


#!/bin/bash
#Shell script to listen parse and aggregate ADSB data
#Run in foreground to see progress bar
#run  in background with: dump2sql.sh >/dev/null &

#variables DUMP1090 server
ADSBhost="localhost"
ADSBport="30003"
#variables  mySQL credentials...use certificates in next version
SQLsrv='xxxx.yyyy.eu-central-1.rds.amazonaws.com'
SQLusr="xxxx"
SQLpwd="xxxxx"
SQLdbs="xxxxx"
SQLqueries="/home/pi03/dump.sql"
#now declare the arrays
counter=0  #for loop control
declare -a arr_call    #collect call_signs per aircraft
declare -a arr_alti    #collect altitude per aircraft
declare -a arr_sped    #collect groundspeed per aircraft
declare -a arr_trck    #collect track per aircraft
declare -a arr_vert    #collect vertical speed per aircraft
#variables-logfiles
TimestampDaily=$(date +"%Y-%m-%d")        #separate logfile every day
TimestampMinut=$(date +"%Y-%m-%d-%H-%M")   #separate SQL file every minute
ADSB_log="/home/pi03/adsb-log.txt"   #status log -
ADSB_csv="/home/pi03/adsb-csv-"   #data capture - collected per day - append TimestampDaily
ADSB_sql="/home/pi03/adsb-sql-"   #SQL statements - append TimestampMinut

#Startup Messages
echo "DUMP1090 Aggregator2mySQL by Matthias Gemelli 2016"
echo "Listening to DUMP1090 on $ADSBhost port $ADSBport"
echo "Writing Logs to $ADSB_log"
echo "Writing Data to $ADSB_dat"
echo "Writing SQL  to $ADSB_sql"
echo "SQL insert to $SQLsrv"
echo "--------------------------------"

#Startup log
Timestamp=$(date +"%Y-%m-%d %H:%M:%S")
echo "Dump2SQL launched with PID $$ at $Timestamp" >> $ADSB_log
echo "MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical" >> "$ADSB_csv                                                                                                 $TimestampDaily.csv"
echo "select count(*) from flights;" >> "$ADSB_sql$TimestampMinut.sql"

#----------------------LOOP starts------------------------
while true; do #outer loop - because netcat stops too often with Dump1090 mutability
#inner loop - netcat listener
nc -d $ADSBhost $ADSBport | while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14                                                                                                  f15 f16 f17 f18 f19 f20 f21 f22
do     #loop until a break is thrown or netcat stops

#first update the timestamp for log files
TimestampDaily=$(date +"%Y-%m-%d")        #separate logfile every day
TimestampMinut=$(date +"%Y-%m-%d-%H-%M")  #separate SQL file every minute

#relevant data fields in every ADSB record
#echo "Field 05 HexIdent         :$f5"
#echo "Field 07 Date message gen :$f7"
#echo "Field 08 Time message gen :$f8"
#echo "Field 11 Callsign         :$f11"
#echo "Field 12 Altitude         :$f12"
#echo "Field 13 GroundSpeed      :$f13"
#echo "Field 14 Track            :$f14"
#echo "Field 15 Latitude         :$f15"
#echo "Field 16 Longitude        :$f16"
#echo "Field 17 Vertical Rate    :$f17"

#now save the data into array, using HexIdent as index
#overwrite only if field is not empty
ident=$((0x${f5}))   #convert hex to decimal
if [ "$f11" != "" ];  then arr_call[ident]="$f11"; fi
if [ "$f12" != "" ];  then arr_alti[ident]="$f12"; fi
if [ "$f13" != "" ];  then arr_velo[ident]="$f13"; fi
if [ "$f14" != "" ];  then arr_trck[ident]="$f14"; fi
if [ "$f17" != "" ];  then arr_vert[ident]="$f17"; fi
#write default values - important for SQL insert
if [ "${arr_call[ident]}" = "" ]; then arr_call[ident]="unknown"; fi
if [ "${arr_alti[ident]}" = "" ]; then arr_alti[ident]="NULL"; fi
if [ "${arr_velo[ident]}" = "" ]; then arr_velo[ident]="NULL"; fi
if [ "${arr_trck[ident]}" = "" ]; then arr_trck[ident]="NULL"; fi
if [ "${arr_vert[ident]}" = "" ]; then arr_vert[ident]="NULL"; fi


#debugging and progress bar - display RAW-before - AGG-after aggregation data
#echo "Debug for $f5 Altitude $f12-${arr_alti[ident]} Flight $f11-${arr_call[ident]}"
#progress bar  X for every location, DOT for every ADSB-message
#if [ "$f15" != "" ]; then  printf "x"; else printf "."; fi


#----------save ADSB data only if location is known--------
#----------pointless to write speed...altitude without location
if [ "$f15" != "" ]; then  #if Latitude is not empty
echo "Position Point received for $f5 ${arr_call[ident]} at alt ${arr_alti[ident]}"

#---------write CSV data and SQL commands to file-------
#echo "MsgCount,HexIdent,Date,Time,Lat,Long,Callsign,Altitude,Speed,Track,Vertical" >> "$ADSB_cs                                                                                                 v$TimestampDaily.csv"
CSV="$counter,$f5,$f7,$f8,$f15,$f16,${arr_call[ident]},${arr_alti[ident]}"
CSV="$CSV,${arr_velo[ident]},${arr_trck[ident]},${arr_vert[ident]}"
echo "$CSV" >> "$ADSB_csv$TimestampDaily.csv"

QUERY="INSERT INTO esp8266data.flightlog "
QUERY="$QUERY (logmsg, loghexid, logdate, logtime, latitude, longitude, "
QUERY="$QUERY logsign, altitude, speed, track, vertical) VALUES "
QUERY="$QUERY ($counter,\"$f5\",\"$f7\",\"$f8\",\"$f15\",\"$f16\","
QUERY="$QUERY \"${arr_call[ident]}\",${arr_alti[ident]},${arr_velo[ident]},"
QUERY="$QUERY ${arr_trck[ident]},${arr_vert[ident]});"
echo "$QUERY" >> "$ADSB_sql$TimestampMinut.sql"

#instead of inserting every separate blip I collect into a SQL file and run in batches
#mysql -h $SQLsrv -u $SQLusr -p$SQLpwd $SQLdbs <$SQLqueries
#if [ $? != 0 ]; then echo "error: $QUERY" >> $ADSB_log ; fi

fi #if position is not empty

#reset the array if it is midnight (fewer planes)
((counter++))   #increase the loop counter
done            #netcat listener loop
Timestamp=$(date +"%Y-%m-%d %H:%M:%S")
echo "Dump2SQL Netcat stopped...Re-Launch with PID $$ at $Timestamp" >> $ADSB_log

done            #outer loop
#-----------------end of the loops----------------
#attention: variables set within the loop stay in the loop

echo "Done for the day..."
echo "Dump2sql done" >> $ADSB_log
echo $(date +"%Y-%m-%d %H:%M%S") >> $ADSB_log

mysql-ins.sh  (insert the data to mySQL)

Main purpose of this script is to insert the flight data into a mySQL database using the mysql-client software package. It automatically looks for the SQL files generated with the dump2sql script and only processes the ones older than one minute (to prevent processing files that are still being written to).
Launch this script every 5 minutes startup with crontab (sudo crontab -e) and insert a line:
*/5 * * * * /home/pi03/mysql-ins.sh >/dev/null
#!/bin/bash

#variables  mySQL credentials...use certificates in next version
SQLsrv='xxxx.yyyy.eu-central-1.rds.amazonaws.com'
SQLusr="xxxx"
SQLpwd="xxxxxxxx"
SQLdbs="xxxxxxx"
SQL_log="/home/pi03/mysql-log.txt"   #status log -
Timestamp=$(date +"%Y-%m-%d-%H-%M")   #separate logfile every day

#Now cycle through all SQL files and process them with mySQL
FILES=/home/pi03/adsb*sql
for f in $FILES
do
  #echo "Processing $f file..."
  #ls -ls $f
  Timestamp=$(date +"%Y-%m-%d-%H-%M")
  #now check if file is old enough to touch
  if [[ $(find $f -mmin +1) != "" ]]
        then #file found that is older as one minute
        #echo "old $f"
        #now launch SQL command
        SQLout=$(mysql -h $SQLsrv -u $SQLusr -p$SQLpwd $SQLdbs <$f 2>&1)
        SQLerr=$?
        if [ $SQLerr != 0 ];
                then  #SQL error
                echo "Error in $f with $SQLout" >>$SQL_log
                cp $f /home/pi03/err$f     #copy the error SQL file
                rm $f  #delete the original file
                else
                echo "Successfully Processed $f" >>$SQL_log
                rm $f   #delete SQL file
                fi  #SQL error
        else #file is fresh - dont touch it
        echo "Ignored fresh file $f" >>$SQL_log
        fi #find mmin1 - check for file age

done

Good Luck! Feedback and improvement ideas very welcome!

7 comments:

  1. Hi, is there a SQL script to create the flightlog database?

    ReplyDelete
  2. Its OK I've figured that out.

    ReplyDelete
  3. I have this up and running but it would appear there is something amiss. I've have the scripts running for nearly four hours and data is flowing into the database but the "logtime" information isn't keeping up with time. I have a min(logtime) of 22:08 which coincides with the time I started running the scripts, 11:08 local. I have a max(logtime) of 23:35 which would equate to a local time of 12:35 but it's now 15:00! The .sql files waiting on processing are currently 14:59 - 15:05 and the last log entry stated that the 14:58 file was successfully processed inferring it is now in the database. When I look into the 15:05 sql file the times are all labelled 23:36:40.xxx to 23:46:45.xxx thus only about 5 seconds of time difference leaving me very confused as to what's happening. Any ideas?

    ReplyDelete
  4. Just a question about what this means "esp8266data" in the "QUERY="INSERT INTO esp8266data.flightlog "" is this to be my MySQL database or is it something else?

    ReplyDelete
  5. Hi Nzgolly...
    Esp8266data us the name if my database...replace with whatever you have.
    Logtime is taken from the adsb feed - compare the raw data in csv with the sql data - maybe there us a glitch?

    ReplyDelete
  6. And uncomment the many echos to diagnose...i left them in from my own testing....

    ReplyDelete
  7. Hi Matthias, the times issue I posted above were resolved by not having the MySQL server located remotely, by havi a local instance on the same RPi as the decoder solved the issue. And thanks for the advice on the db name.

    ReplyDelete