Archive for December, 2020

Setting up NodeJS + MongoDB application

Continuing from:

https://tthtlc.wordpress.com/2020/12/26/setting-nginx-mariadb-server-php-nodejs-mongodb-application/

The basics of MongoDB and NodeJS are covered in other articles.

Now we will import a dataset in CSV format (a file named “xxxx.csv”) into our MongoDB database. The “mongo” command is used to drop any existing copy of the collection in the database, and “mongoimport” is used to import the CSV file into MongoDB.

The database is named “iot_traffic” and the collection is named “dataset1”:

mongo iot_traffic --eval 'db.dataset1.drop()'
mongoimport --type csv -d iot_traffic -c dataset1 --headerline xxxx.csv
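
To illustrate what “--headerline” does, here is a minimal JavaScript sketch (this is not how mongoimport is implemented): the first CSV line supplies the field names, and every following line becomes one document. The helper name csvToDocuments is made up for illustration. The sketch assumes simple CSV without quoted fields, and it keeps all values as strings, whereas mongoimport may convert numeric-looking values.

```javascript
// Sketch only: build one document per CSV row, using the first line as
// the field names. Assumes no quoted fields and no embedded commas.
function csvToDocuments(csvText) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const fields = headerLine.split(',');
  return rows.map((row) => {
    const values = row.split(',');
    const doc = {};
    fields.forEach((field, i) => {
      doc[field] = values[i]; // values stay strings in this sketch
    });
    return doc;
  });
}

// Hypothetical two-row sample, not taken from the real dataset
console.log(csvToDocuments('ts,proto,label\n1.5,tcp,Benign\n2.5,udp,Malicious'));
```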

After the import, the next step is to read the collection and display it. Put the following content into a JavaScript file (for example, “mongo_list_data.js”):

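const { MongoClient } = require('mongodb');
const url = "mongodb://localhost:27017";

// Callback style, as supported by the driver versions available at the time of writing
MongoClient.connect(url, {useNewUrlParser: true}, (err, client) => {
  if (err) {
    console.log(err);
    process.exit(1); // non-zero exit code signals failure
  }
  const dbo = client.db('iot_traffic');
  console.log('database connected!');
  const collection = dbo.collection('dataset1');
  // Fetch every document in the collection and print it
  collection.find().toArray((err, results) => {
    if (err) {
      console.log(err);
      process.exit(1);
    }
    console.log(results);
    client.close();
  });
});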

Then execute the JavaScript file with “nodejs mongo_list_data.js”.

As shown above, no username or password is needed to access the database.

But a user can be created as follows (put this into a file and call it “create_user.sh”):

mongo <<EOF
use iot_traffic
db.dropUser( "iotuser" )
db.createUser( { user: "iotuser",
                 pwd: "xxxxxx",
                 customData: { employeeId: 12345 },
                 roles: [ { role: "readWrite", db: "iot_traffic" } ] },
               { w: "majority" , wtimeout: 5000 } )
EOF

So effectively the “mongo” command is used to create the database user.

To then access the database with a username and password (the username is “iotuser”; replace “xxxx” with the password that was set in “create_user.sh”), put the following into a JavaScript file named “mongo_data_list.js”:

const { MongoClient } = require('mongodb');
// Credentials and the authentication database are part of the URL
const url = "mongodb://iotuser:xxxx@localhost:27017/iot_traffic";

MongoClient.connect(url, {useNewUrlParser: true}, (err, client) => {
  if (err) {
    console.log(err);
    process.exit(1); // non-zero exit code signals failure
  }
  const dbo = client.db('iot_traffic');
  console.log('database connected!');
  const collection = dbo.collection('dataset1');
  // Fetch every document in the collection and print it
  collection.find().toArray((err, results) => {
    if (err) {
      console.log(err);
      process.exit(1);
    }
    console.log(results);
    client.close();
  });
});
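
One thing to watch for with the URL form above: if the real password contains characters such as “@”, “:” or “/”, they have to be percent-encoded inside the connection string. A minimal sketch, assuming a hypothetical helper buildMongoUri (it is not part of the mongodb driver) and a made-up password:

```javascript
// Sketch only: assemble a MongoDB connection string with percent-encoded
// credentials. buildMongoUri is an illustrative helper, not a driver API.
function buildMongoUri({ user, password, host, port, db }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${host}:${port}/${db}`;
}

// 'p@ss/word' is a made-up password used only to show the encoding
console.log(buildMongoUri({
  user: 'iotuser',
  password: 'p@ss/word',
  host: 'localhost',
  port: 27017,
  db: 'iot_traffic',
}));
// prints mongodb://iotuser:p%40ss%2Fword@localhost:27017/iot_traffic
```

The resulting string can be passed as the url in the script above.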

To verify that the user has been created, you have to be inside the “iot_traffic” database. First, using the “mongo” command:

Then execute the file with “nodejs mongo_data_list.js”; the output is the same as that of the earlier script.

Alternatively you can use the following program (using “MongoClient”) to connect to MongoDB:

const MongoClient = require('mongodb').MongoClient;
const uri = "mongodb://iotuser:xxxx@localhost:27017/iot_traffic?retryWrites=true&w=majority";
const client = new MongoClient(uri, { useNewUrlParser: true });
client.connect(err => {
  if (err) {
    console.log(err);
    process.exit(1);
  }
  // get a handle to the collection (MongoDB creates it on first write if it does not exist)
  const collection = client.db("iot_traffic").collection("dataset1");
  // perform actions on the collection object

  client.close();
});

References:

https://stackoverflow.com/questions/16124255/how-to-connect-with-username-password-to-mongodb-using-native-node-js-driver

https://stackoverflow.com/questions/24985684/mongodb-show-all-contents-from-all-collections: How to use “mongo” command.

http://mongodb.github.io/node-mongodb-native/api-generated/mongoclient.html: MongoClient documentation


RapidMiner + Deep Learning: IoT traffic Malware Detection

This post takes the previous IoT traffic malware detection with RapidMiner + kNN:

https://tthtlc.wordpress.com/2020/12/27/rapidminer-knn-iot-traffic-for-malware-detection/

and switches the model to Deep Learning.

The first attempt uses only ONE column as the label (for supervised learning):

Next, the number of label attributes is increased to TWO: detailed-label + label.

And the result is:

All other configuration is unchanged:

The “deep learning” extension used in RapidMiner is based on H2O Deep Learning:

https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/deep-learning.html

http://docs.h2o.ai/h2o-tutorials/latest-stable/tutorials/deeplearning/index.html

Below are the configuration options available in RapidMiner:


RapidMiner + kNN + IoT Traffic for malware detection

The dataset used is from:

Stratosphere Laboratory. A labeled dataset with malicious and benign IoT network traffic. January 22th. Agustin Parmisano, Sebastian Garcia, Maria Jose Erquiaga.

https://www.stratosphereips.org/datasets-iot23

A lighter version containing only the labeled flows, without the pcap files (8.8 GB), is available here:

https://mcfp.felk.cvut.cz/publicDatasets/IoT-23-Dataset/iot_23_datasets_small.tar.gz

As quoted from IoT-23 website:

IoT-23 is a new dataset of network traffic from Internet of Things (IoT) devices. It has 20 malware captures executed in IoT devices, and 3 captures for benign IoT devices traffic. It was first published in January 2020, with captures ranging from 2018 to 2019. This IoT network traffic was captured in the Stratosphere Laboratory, AIC group, FEL, CTU University, Czech Republic. Its goal is to offer a large dataset of real and labeled IoT malware infections and IoT benign traffic for researchers to develop machine learning algorithms. This dataset and its research is funded by Avast Software, Prague.
The IoT-23 dataset consists of twenty three captures (called scenarios) of different IoT network traffic. These scenarios are divided into twenty network captures (pcap files) from infected IoT devices (which will have the name of the malware sample executed on each scenario) and three network captures of real IoT devices network traffic (that have the name of the devices where the traffic was captured). On each malicious scenario we executed a specific malware sample in a Raspberry Pi, that used several protocols and performed different actions. Table 1 shows the characteristics of the IoT botnet scenarios and Table 2 shows the protocols that were found in each network traffic capture. The network traffic captured for the benign scenarios was obtained by capturing the network traffic of three different IoT devices: a Philips HUE smart LED lamp, an Amazon Echo home intelligent personal assistant and a Somfy smart doorlock. It is important to mention that these three IoT devices are real hardware and not simulated (see Images 1,2 and 3) . This allows us to capture and analyse real network behaviour. Both malicious and benign scenarios run in a controlled network environment with unrestrained internet connection like any other real IoT device.

First, a bash shell script is used to clean up the dataset:

1. Convert every delimiter (the tab character, byte 0x09) to “,”: sed 's/\x09/,/g'

2. Due to some error, the last three columns are not separated by the 0x09 delimiter but by three space characters instead: sed 's/   /,/g'

3. The header names are cleaned up, as some extra names are mixed in, and the first few rows are deleted, as they are self-documentation.

4. The single last row is an end-indicator; just remove it.
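
The same four steps could also be done in NodeJS instead of sed. The following is only a sketch under stated assumptions, not the script actually used: it assumes the self-documentation rows and the end-indicator row all start with “#”, and that the column names sit on a row starting with “#fields”. The function names toCsvLine and cleanLog and the sample rows in the usage line are made up for illustration.

```javascript
// Sketch only: mirrors the four cleanup steps, assuming metadata rows
// start with '#' and the column names are on the '#fields' row.
function toCsvLine(line) {
  return line
    .replace(/ {3}/g, ',') // step 2: three-space delimiter -> comma
    .replace(/\t/g, ',');  // step 1: tab (0x09) delimiter -> comma
}

function cleanLog(text) {
  const out = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.length === 0) continue;
    if (line.startsWith('#fields')) {
      // step 3: keep the column names, drop the extra '#fields' token
      out.push(toCsvLine(line.replace(/^#fields\s+/, '')));
    } else if (!line.startsWith('#')) {
      out.push(toCsvLine(line)); // ordinary data row
    }
    // steps 3 and 4: all other '#' rows (documentation, end marker) are dropped
  }
  return out.join('\n');
}

// Hypothetical miniature input, not real rows from the dataset
console.log(cleanLog('#fields\tts\tuid   label\n1.0\tC1   Benign\n#close\tend'));
```

Note that this naive replacement would break if a field itself contained a comma or three consecutive spaces, so the result is worth checking before importing it.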

The result is the following:

https://drive.google.com/file/d/1EK_SFr4ecXNP_Hez9CVDNa-YWt-o1Xv5/view?usp=sharing

Apart from the manual changes above, all rows can be imported into RapidMiner without errors.

First we import the dataset in RapidMiner:

One has about 23K examples (from the 1-1 dataset):

And another has about 1 million examples (from the 34-1 dataset):

There are two columns that can be used as the label for supervised learning; for the time being we use only one of them, “label”.

For SELECT attribute:

For SET-role, the column “label” is selected as “label” attribute.

For Split Validation, we use a 70% split ratio (70% of the examples for building the kNN model, and 30% for validation and accuracy computation):
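
To make the 70/30 idea concrete outside RapidMiner, here is a small plain-JavaScript sketch. It is not what RapidMiner runs internally. The names splitByRatio, knnPredict and accuracy are illustrative, the split is a simple in-order cut with no shuffling or stratification, and Euclidean distance with a majority vote is assumed. Each row is assumed to look like { features: [numbers], label: 'text' }.

```javascript
// Sketch only: in-order split, then kNN with Euclidean distance and
// majority vote. Rows are assumed to be { features: number[], label: string }.
function splitByRatio(rows, ratio) {
  const cut = Math.floor(rows.length * ratio);
  return { train: rows.slice(0, cut), test: rows.slice(cut) };
}

function knnPredict(train, features, k) {
  const nearest = train
    .map((row) => ({
      label: row.label,
      dist: Math.hypot(...row.features.map((v, i) => v - features[i])),
    }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);
  // Count the labels among the k nearest training rows
  const votes = {};
  for (const n of nearest) votes[n.label] = (votes[n.label] || 0) + 1;
  return Object.keys(votes).sort((a, b) => votes[b] - votes[a])[0];
}

function accuracy(train, test, k) {
  const hits = test.filter((row) => knnPredict(train, row.features, k) === row.label);
  return hits.length / test.length; // fraction of test rows predicted correctly
}
```

With the rows loaded into an array, something like “const { train, test } = splitByRatio(rows, 0.7);” followed by “accuracy(train, test, 3)” gives the share of correctly classified validation rows.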

The inside of the Split Validation operator looks like this:

The accuracy figures generated are based on a few criteria:

and this is the output:

This is where you can download a copy of RapidMiner Studio to try (it has both Linux and Windows versions; here I am using the Linux one):
https://docs.rapidminer.com/latest/studio/installation/

Getting started guide:

https://academy.rapidminer.com/learning-paths/get-started-with-rapidminer-and-machine-learning

Setting up Nginx + MariaDB server + Php + NodeJS + MongoDB application

This is how I set up the applications after provisioning an Ubuntu client on the DigitalOcean platform:

apt-get install nginx

apt-get install net-tools

apt-get install php

apt-get install php-fpm

apt-get install mariadb-server mariadb-client

mysql_secure_installation

mysql -u root -p ====> check that the MySQL prompt appears:

systemctl status mariadb

(Other miscellaneous packages that are possibly not included by default: “apt-get install build-essential git bison flex libxml2-dev pkg-config sqlite3 libsqlite3-dev”)

Other useful commands:

systemctl status nginx.service
systemctl restart nginx.service
systemctl status mariadb

Now we will stop MariaDB and disable it from restarting upon reboot:

systemctl stop mariadb.service
systemctl disable mariadb.service

And install NodeJS + MongoDB instead:

apt-get install nodejs
apt-get install mongodb-server

Both “netstat -natp” and “systemctl status” can be used to check that MongoDB is running:

This is how you can connect to your MongoDB at the terminal for development:

https://www.mongodb.com/blog/post/quick-start-nodejs-mongodb--how-to-get-connected-to-your-database

https://www.w3schools.com/nodejs/nodejs_mongodb_query.asp


https://medium.com/analytics-vidhya/import-csv-file-into-mongodb-9b9b86582f34

TO BE CONTINUED…..

Searching Linux kernel for stack/heap/integer overflow

How to search the Linux kernel source code for stack and integer overflows?

First is the “cpy.*argv” pattern (either “strcpy” or “memcpy” directly from argv, or other input, without first checking its length, resulting in a stack or heap overflow):


Next is the “cpy.*+” pattern (an arithmetic operation done inside “strcpy” or “memcpy” without checking that the result is valid; potentially a stack or heap overflow, or possibly an integer overflow):


Last is the “malloc.*+” pattern (an arithmetic operation done on the size without checking that it is valid before allocating; likely an integer overflow):
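
The three patterns can also be tried out in NodeJS. This is a rough sketch, assuming “+” is meant as a literal plus sign (so it is escaped in the JavaScript regexes). The function name findSuspectLines and the sample source lines are made up for illustration. Matches are only candidates for manual review, not confirmed bugs.

```javascript
// Sketch only: flag source lines matching the three search patterns.
// '+' is treated as a literal plus sign, so it is escaped here.
const PATTERNS = [
  { name: 'cpy.*argv', regex: /cpy.*argv/ },
  { name: 'cpy.*+', regex: /cpy.*\+/ },
  { name: 'malloc.*+', regex: /malloc.*\+/ },
];

function findSuspectLines(source) {
  const hits = [];
  source.split('\n').forEach((text, index) => {
    for (const { name, regex } of PATTERNS) {
      if (regex.test(text)) {
        hits.push({ line: index + 1, pattern: name, text: text.trim() });
      }
    }
  });
  return hits;
}

// Hypothetical sample lines, not taken from the kernel source
const sampleSource = [
  'strcpy(buf, argv[1]);',
  'memcpy(dst, src, len + 4);',
  'p = kmalloc(n + hdr, GFP_KERNEL);',
  'return 0;',
].join('\n');
console.log(findSuspectLines(sampleSource));
```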

