Tuesday, February 10, 2015

Visualize ssh logs

I like R because it has a lot of packages that let you analyze and visualize data easily.

I will use this honeypot log from @Malwared_ to do a simple demo:

As always, we first need to clean and prepare the data. We are going to use the GeoIP Python bindings with the MaxMind databases to accomplish this task:

This script produces a file called sshlog.csv with the following columns: IP, CC, Country, AS, City, Lat, Long.
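One detail worth checking in the generated file: some country names, such as "Korea, Republic of", contain a comma, so that field has to be quoted for the columns to line up. Here is a minimal sketch with two made-up rows. The IPs come from the documentation ranges, and the AS names and coordinates are purely illustrative, not taken from the honeypot log:

```r
# Two hypothetical rows in the same column layout as sshlog.csv
demo_csv <- '192.0.2.10,KR,"Korea, Republic of",AS64496 Example Net,Seoul,37.57,126.98
198.51.100.7,CN,China,AS64497 Example Backbone,Nanjing,32.06,118.78'

demo <- read.csv(text = demo_csv, header = FALSE,
                 na.strings = c("NA", "NaN", ""),
                 stringsAsFactors = FALSE)
colnames(demo) <- c("IP", "CC", "Country", "AS", "City", "Lat", "Long")

# The quoted country name stays in a single field, so there are still 7 columns
ncol(demo)
demo$Country
```

If the country field were not quoted, the first row would be split into eight fields and the coordinates would end up in the wrong columns.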

Prior to visualization, let's take a quick look at the data:

# Read the dataset (the file has no header row, so we name the columns ourselves)
sshpot <- read.csv(file = "~/GIT/blog/sshlog.csv", header = FALSE,
                   na.strings = c("NA", "NaN", ""))
colnames(sshpot) <- c("IP", "CC", "Country", "AS", "City", "Lat", "Long")
# Get an idea of the data
# By country
head(sort(table(sshpot$Country),decreasing = TRUE))
## 
##              China      United States          Hong Kong 
##                934                215                156 
##             France Korea, Republic of            Germany 
##                 83                 49                 47
# By city
head(sort(table(sshpot$City),decreasing = TRUE))
## 
##           Huzhou Central District          Nanjing         Nanchang 
##              182              150              138              132 
##          Beijing            Hefei 
##               66               64
# By AS
head(sort(table(sshpot$AS),decreasing = TRUE))
## 
##                                              AS4134 Chinanet 
##                                                          595 
##     AS23650 AS Number for CHINANET jiangsu province backbone 
##                                                          104 
##                            AS4837 CNCGROUP China169 Backbone 
##                                                           62 
##                                        AS12876 ONLINE S.A.S. 
##                                                           61 
##                                     AS63854 HEE THAI LIMITED 
##                                                           53 
## AS4808 CNCGROUP IP network China169 Beijing Province Network 
##                                                           28
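Raw counts are easier to compare as shares of the total. Here is a minimal sketch using `prop.table()` on a toy vector of country codes. The `cc` values are made up for illustration; on the real data you would pass `sshpot$CC` instead:

```r
# Hypothetical country codes standing in for sshpot$CC
cc <- c("CN", "CN", "CN", "US", "HK", "CN", "US", "FR")

# Count entries per code, most frequent first
counts <- sort(table(cc), decreasing = TRUE)

# Convert the counts to percentages of all entries, rounded to one decimal
shares <- round(100 * prop.table(counts), 1)
shares
```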

As expected, a large share of the entries comes from China. Let's visualize it:

# Load the required package
library(rworldmap)

# Count entries per country code and join the counts to the world map
dfC <- as.data.frame.table(table(sshpot$CC))
sPDF <- joinCountryData2Map(dfC, joinCode = "ISO2", nameJoinColumn = "Var1")

mapCountryData(sPDF,nameColumnToPlot="Freq",catMethod="pretty")
barplotCountryData( sPDF
 , nameColumnToPlot= "Freq"
 , nameCountryColumn = "NAME"
 , catMethod="pretty"
 , na.last = NA
 , decreasing = TRUE
 , scaleSameInPanels = TRUE
 , numPanels = 5
 , cex = 1.1 )

If we want to take a closer look at China:

# Zoom in on the China region
# (recent ggmap versions need a Google API key, see register_google())
library(ggplot2)
library(ggmap)
map <- get_map(location = "China", zoom = 4,
               source = "google", maptype = "roadmap", scale = 1)

# Plot the map with the points on it: longitude goes on x, latitude on y
ggmap(map) +
  geom_point(data = sshpot, aes(x = Long, y = Lat),
             fill = "red", alpha = 0.8, size = 5, shape = 21)
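Since the map is zoomed on China, it may be worth plotting only the rows geolocated there and dropping rows without coordinates. Here is a minimal sketch, assuming the `CC` column holds ISO 3166 alpha-2 codes (as the `joinCode = "ISO2"` join above suggests). The toy data frame `pts` is hypothetical and only mirrors the column names of `sshpot`:

```r
# Hypothetical rows mirroring the CC/Lat/Long columns of sshpot
pts <- data.frame(
  CC   = c("CN", "CN", "US", "CN"),
  Lat  = c(32.06, NA, 37.75, 39.90),
  Long = c(118.78, 116.40, -97.82, 116.40),
  stringsAsFactors = FALSE
)

# Keep only the Chinese rows that have both coordinates
china <- subset(pts, CC == "CN" & !is.na(Lat) & !is.na(Long))
nrow(china)
```

The resulting data frame can then be passed as `data` to `geom_point()` in place of the full `sshpot` table.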

References: