How to configure a project to use CMake?

How do you use CMake in a project? Suppose the project is named “myllvm” and has only one file, log.c, and you have decided to use “gcc” to compile it.

This is the simplest CMakeLists.txt to get started:

The directory has only two files: log.c and CMakeLists.txt.

Either clang or gcc can be used to build the binary. How do you configure which one?

First run “mkdir build” and “cd build”, then issue the following command to compile with gcc:

Alternatively, you can use clang (i.e., replace “gcc” with “clang”).

It is also possible to modify CMakeLists.txt so that “gcc” is specified inside it.

How to add new directories with new C files (the subdirectory is called “core”):

How to add a header directory to the compilation?

How to add multiple C or CPP files in the “src” subdirectory?
#set(SOURCES src/mainapp.cpp src/Student.cpp)
#or just a wildcard:
file(GLOB SOURCES "src/*.cpp")

For a more complete list of commands (for CMakeLists.txt), we can dig into the CMake source code and look in the “Help/command” directory. The commands are also listed here:

When you run “cmake -G <generator>”, these are the generators currently available (under the Help/generator directory):

How to make use of the available CMake recipes written for each platform:

For example, here are some Linux-related CMake recipes:

And here are the Windows-related CMake templates:

How to use the CMake recipes for common tasks (e.g., finding the NVIDIA CUDA compiler, finding header files, etc.), which are called Modules:

As an example of one subtask, these are all the “finding compiler” related modules (the number on the left is the line count of each file, a good proxy for the complexity of the task it handles):

Here is a complete list of all the tasks:

For more complete tutorial examples:


Understanding Amazon Web Services Security

First, to highlight the security issues that AWS occasionally has:

One in particular is the ROBOT attack:

Amazon Web Services: Overview of Security Processes:

Tips for AWS firewall:

AWS IoT Security:

Application Development How to:

Development APIs and setup:


Characteristics of eBPF

a. It can never read or modify arbitrary parts of kernel memory; access goes only through known channels and mechanisms.

b. If it uses FTRACE for tracing, the dynamic instrumentation is delayed as long as possible: it happens only when the code execution path is encountered. So when a user requests eBPF tracing of kernel APIs, the instrumentation is often not put in place until the point of execution.

c. Instead of running native binary code directly, the kernel runs a virtual machine that interprets (or JIT-compiles) BPF bytecode. Because this intermediate language must not contain loops, every program is verified to be loop-free before it runs, and many other properties are verified as well.

d. Whenever it reads parts of kernel memory, the data are only snapshots of constantly changing state. Data misses can happen, so the results are statistical by nature. The memory buffer (a ring buffer) may be too small and get reused quickly, especially when collecting network packets.

e. eBPF has no userspace dependencies, like FTRACE and unlike SystemTap, so many front ends have been spawned in Python, Go, Rust, and other languages (just as there are FTRACE + Python, FTRACE + Go, and so on).

f. What is seccomp BPF?
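
To make the "no loops" point in item c concrete, here is a minimal sketch of a loop check in C. It is only an illustration under my own assumptions: struct toy_insn, the TOY_* opcodes and toy_verify() are hypothetical names, not the kernel's struct bpf_insn or its verifier, which checks far more than this. The idea is simply that if every jump goes strictly forward and stays inside the program, execution can never revisit an instruction.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction set, NOT the kernel's struct bpf_insn. */
enum toy_op { TOY_ALU, TOY_JMP, TOY_EXIT };

struct toy_insn {
    enum toy_op op;
    int off;            /* for TOY_JMP: target index = index + 1 + off */
};

/*
 * Accept a program only if it ends in TOY_EXIT and every jump lands
 * strictly forward and inside the program. With no backward jumps,
 * no instruction can run twice, so the program must terminate.
 */
bool toy_verify(const struct toy_insn *prog, size_t len)
{
    if (len == 0 || prog[len - 1].op != TOY_EXIT)
        return false;

    for (size_t i = 0; i < len; i++) {
        if (prog[i].op != TOY_JMP)
            continue;
        if (prog[i].off < 0)
            return false;               /* backward jump: possible loop */
        if ((size_t)prog[i].off >= len - i - 1)
            return false;               /* target is past the last insn */
    }
    return true;
}
```

Calling toy_verify() on an array of toy_insn returns false as soon as it sees a negative offset, which is the toy equivalent of the kernel refusing to load a program.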

"BPF does not define itself by only providing its instruction set, but also by offering further infrastructure around it such as maps which act as efficient key / value stores, helper functions to interact with and leverage kernel functionality, tail calls for calling into other BPF programs, security hardening primitives, a pseudo file system for pinning objects (maps, programs), and infrastructure for allowing BPF to be offloaded, for example, to a network card.

LLVM provides a BPF back end, so that tools like clang can be used to compile C into a BPF object file, which can then be loaded into the kernel. BPF is deeply tied to the Linux kernel and allows for full programmability without sacrificing native kernel performance.

Last but not least, also the kernel subsystems making use of BPF are part of BPF’s infrastructure. The two main subsystems discussed throughout this document are tc and XDP where BPF programs can be attached to. XDP BPF programs are attached at the earliest networking driver stage and trigger a run of the BPF program upon packet reception. By definition, this achieves the best possible packet processing performance since packets cannot get processed at an even earlier point in software. However, since this processing occurs so early in the networking stack, the stack has not yet extracted metadata out of the packet. On the other hand, tc BPF programs are executed later in the kernel stack, so they have access to more metadata and core kernel functionality. Apart from tc and XDP programs, there are various other kernel subsystems as well which use BPF such as tracing (kprobes, uprobes, tracepoints, etc)."

How can eBPF be compromised by vulnerabilities?

What are the security risks of allowing arbitrary runtime code to be executed in the kernel?

According to:

We can have:

  1. Privilege escalation
  2. Buffer overflow (you can see how spender tried to protect all the buffers used in BPF processing, for example by using macros instead of haphazard numbers like 64). Another way to overflow the buffer is via “JIT spraying”.
  3. Information leakage

Actual bug history: (information leakage from kernel to userland)

But BPF is a double-edged sword:


Applications of eBPF

There are many applications of BPF:

Allowing a non-root, user-customizable firewall:

"It is a rare situation where decades of undisciplined tinkering with Linux esoterica occasionally pay out, but this was such an occasion. Unlike in BSD, where Berkeley Packet Filter is implemented as a root-only device that attaches to entire network interfaces, on Linux it is implemented in terms of a socket option that usually attaches to AF_PACKET or AF_RAW sockets, however it is a little known fact you can also attach such filters to AF_INET sockets, and better yet, the ability to do so does not require root. Essentially, Linux allows non-root programs to configure their own little private firewall."

Stracing and understanding the flow of syscalls + bpf() calls + arguments: complementing strace via BPF? (architecture of the Netronome SmartNIC)

Kernel Path understanding and tracing: (Network debugging) (Kernel analysis) (network traffic analysis)

Network packet analysis and processing (with speed):

Performance Analysis:


Containers analysis:

Intrusion Detection:

Tracing – both network and processes:

What are the technologies/foundations that BPF builds on:

bpf stacks:

bpf output:

bpf + kprobes: (Intro to Kprobes)

bpf + tracepoints:

bpf + userspace tracepoints:

bpf + systemtap:

How to turn any syscall into an event: Introducing eBPF Kernel probes:

Running examples from BPF samples in linux kernel source directory

Summarizing the installation needed before running make on the kernel source:

sudo apt-get install build-essential bison flex
# llc ships inside the llvm package, so there is no separate "llc" package
sudo apt-get install clang llvm
# elfutils-libelf-devel is the RPM name; on Ubuntu the package is libelf-dev
sudo apt-get install libelf-dev

sudo apt-key adv --keyserver --recv-keys 4052245BD4284CDD
echo "deb$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
sudo apt-get update
sudo apt-get install bcc-tools libbcc-examples linux-headers-$(uname -r)
sudo apt-get install netsniff-ng
sudo apt-get install libpci-dev

sudo apt-get install libreadline6-dev

sudo apt-get install libcap-dev
sudo apt-get install libcap-ng-dev
sudo apt-get install libmount-dev
sudo apt-get install libxen-dev

sudo apt-get install linux-headers-4.15.0-43-generic
sudo apt-get install binutils-dev
sudo apt-get install libpopt-dev

sudo apt-get install libnuma-dev libfuse-dev

sudo apt-get install netperf

sudo apt-get install gcc-multilib libc6-i386 libc6-dev-i386

cd to the Linux kernel source’s samples directory and run “make all”:

Testing the sample binaries as “sudo”:


Running “make V=1 all” lets you see the details of the compilation:

So this means that sockex2 is compiled from sockex2_user.c and the libbpf.a library, which I compiled separately earlier in the tools subdirectory.

Now this is the sockex2_user.c:

We can see that it loads the sockex2_kern.o file via the load_bpf_file() API, followed by open_raw_sock() and setsockopt() to attach the BPF program to the socket so that it runs in the kernel.

So we can see from sockex2_kern.c:

bpf_prog2() calls flow_dissector(), which inspects the SKB packet and identifies the protocol fields to be filtered on.
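
To get a feel for what "identifying the protocol fields" means, here is a minimal userspace sketch in C. It is not the code from sockex2_kern.c: toy_ipv4_proto() and the TOY_* constants are hypothetical names of mine, and it assumes a plain Ethernet frame carrying IPv4 with no VLAN tag. It reads the EtherType at byte offset 12 and the IPv4 protocol field at offset 9 of the IP header.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HLEN   14       /* Ethernet header: dst(6) + src(6) + type(2) */
#define TOY_ETH_P_IP   0x0800   /* EtherType value for IPv4 */
#define TOY_IP_MIN_LEN 20       /* minimum IPv4 header length */

/*
 * Return the IPv4 protocol number (e.g. 6 for TCP, 17 for UDP) carried in
 * an untagged Ethernet frame, or -1 if the frame is too short or not IPv4.
 */
int toy_ipv4_proto(const uint8_t *pkt, size_t len)
{
    if (len < TOY_ETH_HLEN + TOY_IP_MIN_LEN)
        return -1;                          /* bounds check before any read */

    /* EtherType is stored big-endian at offsets 12 and 13. */
    uint16_t ethertype = (uint16_t)((pkt[12] << 8) | pkt[13]);
    if (ethertype != TOY_ETH_P_IP)
        return -1;

    const uint8_t *ip = pkt + TOY_ETH_HLEN;
    if ((ip[0] >> 4) != 4)
        return -1;                          /* IP version nibble must be 4 */

    return ip[9];                           /* protocol field */
}
```

The explicit length check before every read mirrors the style a BPF program has to follow, since the verifier will not accept unchecked packet accesses.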

A key structure in this is the BPF map:

A deeper understanding of eBPF maps and of all the bpf_XXX APIs will be needed.











Exploring eBPF Tracing from userspace to kernel

First, can my Ubuntu 16.04’s latest boot image run BPF?

According to “uname -a” and a check of the boot config file:

The answer is yes, the kernel supports it. Next is to install the userland tools:

sudo apt-key adv --keyserver --recv-keys 4052245BD4284CDD
echo "deb$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
sudo apt-get update
sudo apt-get install bcc-tools libbcc-examples linux-headers-$(uname -r)


How does it work?

Let’s take a look at a BPF sample:


First is creating the BPF map:

“a generic data structure that allows data to be passed back and forth within the kernel or between the kernel and user space. As the name “map” implies, data is stored and retrieved using a key.”

Each map is defined by four values: a type, a maximum number of elements, a value size in bytes, and a key size in bytes.
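
As a rough sketch of those four values, here is how they could be held in a C struct. The names struct toy_map_def and toy_map_payload_bytes() are hypothetical and mine; in the real interface the corresponding fields are map_type, key_size, value_size and max_entries inside union bpf_attr. The helper only estimates the raw key and value payload and ignores whatever bookkeeping the kernel adds.

```c
#include <stdint.h>

/* Hypothetical mirror of the four map attributes (not a kernel structure). */
struct toy_map_def {
    uint32_t type;          /* which kind of map, e.g. hash or array */
    uint32_t key_size;      /* size of one key in bytes */
    uint32_t value_size;    /* size of one value in bytes */
    uint32_t max_entries;   /* maximum number of elements */
};

/*
 * Lower bound on the bytes needed for keys and values when the map is full.
 * Computed in 64 bits so large maps do not overflow the multiplication.
 */
uint64_t toy_map_payload_bytes(const struct toy_map_def *d)
{
    return (uint64_t)d->max_entries *
           ((uint64_t)d->key_size + (uint64_t)d->value_size);
}
```

For example, a map with 4-byte keys, 8-byte values and 256 entries needs at least 256 * 12 bytes of payload.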

And the types of map:

  • BPF_MAP_TYPE_HASH: a hash table
  • BPF_MAP_TYPE_ARRAY: an array map, optimized for fast lookup speeds, often used for counters
  • BPF_MAP_TYPE_PROG_ARRAY: an array of file descriptors corresponding to eBPF programs; used to implement jump tables and sub-programs to handle specific packet protocols
  • BPF_MAP_TYPE_PERCPU_ARRAY: a per-CPU array, used to implement histograms of latency
  • BPF_MAP_TYPE_PERF_EVENT_ARRAY: stores pointers to struct perf_event, used to read and store perf event counters
  • BPF_MAP_TYPE_CGROUP_ARRAY: stores pointers to control groups
  • BPF_MAP_TYPE_PERCPU_HASH: a per-CPU hash table
  • BPF_MAP_TYPE_LRU_HASH: a hash table that only retains the most recently used items
  • BPF_MAP_TYPE_LRU_PERCPU_HASH: a per-CPU hash table that only retains the most recently used items
  • BPF_MAP_TYPE_LPM_TRIE: a longest-prefix match trie, good for matching IP addresses to a range
  • BPF_MAP_TYPE_STACK_TRACE: stores stack traces
  • BPF_MAP_TYPE_ARRAY_OF_MAPS: a map-in-map data structure
  • BPF_MAP_TYPE_HASH_OF_MAPS: a map-in-map data structure
  • BPF_MAP_TYPE_DEVMAP: for storing and looking up network device references
  • BPF_MAP_TYPE_SOCKMAP: stores and looks up sockets and allows socket redirection with BPF helper functions
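
The "longest-prefix match" idea behind the BPF_MAP_TYPE_LPM_TRIE entry above can be sketched in a few lines of C. Note that this is a plain linear scan over a table, not a trie, and struct toy_route and toy_lpm_lookup() are hypothetical names for illustration only. Addresses are assumed to be IPv4 in host byte order, with prefix lengths from 0 to 32.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_route {
    uint32_t prefix;    /* IPv4 prefix, host byte order */
    uint8_t  len;       /* prefix length in bits, assumed 0..32 */
    int      value;     /* whatever is stored for this range */
};

/* Build a netmask with the top `len` bits set; len == 0 needs special care
 * because shifting a 32-bit value by 32 is undefined in C. */
static uint32_t toy_mask(uint8_t len)
{
    return len == 0 ? 0 : 0xFFFFFFFFu << (32 - len);
}

/* Return the matching entry with the longest prefix, or NULL if none match. */
const struct toy_route *toy_lpm_lookup(const struct toy_route *tbl, size_t n,
                                       uint32_t addr)
{
    const struct toy_route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        uint32_t m = toy_mask(tbl[i].len);
        if ((addr & m) != (tbl[i].prefix & m))
            continue;                       /* address is outside this range */
        if (best == NULL || tbl[i].len > best->len)
            best = &tbl[i];                 /* more specific match wins */
    }
    return best;
}
```

With entries for 10.0.0.0/8 and 10.1.0.0/16, looking up 10.1.2.3 picks the /16 entry because it is the more specific of the two matches.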

The flow is: create the map => write the program => load the program => attach it to a kernel network socket for execution.

This map can also be shared across different tracing events:


After creating the map comes the programming in raw BPF instructions: bpf_insn_prog[].

Instead of coding in the raw BPF language, it is also possible to have the program generated by a BPF compiler:


Alternatively, you can use bpf_asm_compile() to generate the assembly.

Then setsockopt() is used to attach the BPF program to the socket, so that it runs on that socket’s packet path.

In summary, these are the operations you can do with the bpf() APIs:

create a map, look up an element by key, load a BPF program, and update or delete an element.
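
To show the shape of those map operations without needing root or a real bpf() call, here is a toy userspace key/value store in C. Everything in it (struct toy_map, toy_map_create(), toy_map_lookup(), toy_map_update(), toy_map_delete(), the capacity of 8) is hypothetical and mine. The real operations go through the bpf() syscall with commands such as BPF_MAP_CREATE, BPF_MAP_LOOKUP_ELEM, BPF_MAP_UPDATE_ELEM and BPF_MAP_DELETE_ELEM, and the kernel's maps are implemented very differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 8   /* assumed capacity, fixed when the map is created */

struct toy_entry {
    bool     used;
    uint32_t key;
    uint64_t value;
};

struct toy_map {
    struct toy_entry slot[TOY_MAX_ENTRIES];
};

/* "create": start with every slot empty. */
void toy_map_create(struct toy_map *m)
{
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++)
        m->slot[i].used = false;
}

/* "lookup": return a pointer to the stored value, or NULL if the key is absent. */
uint64_t *toy_map_lookup(struct toy_map *m, uint32_t key)
{
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++)
        if (m->slot[i].used && m->slot[i].key == key)
            return &m->slot[i].value;
    return NULL;
}

/* "update": overwrite an existing key or claim a free slot; -1 if the map is full. */
int toy_map_update(struct toy_map *m, uint32_t key, uint64_t value)
{
    uint64_t *existing = toy_map_lookup(m, key);
    if (existing != NULL) {
        *existing = value;
        return 0;
    }
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (!m->slot[i].used) {
            m->slot[i].used = true;
            m->slot[i].key = key;
            m->slot[i].value = value;
            return 0;
        }
    }
    return -1;
}

/* "delete": free the slot holding the key; -1 if the key was not present. */
int toy_map_delete(struct toy_map *m, uint32_t key)
{
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (m->slot[i].used && m->slot[i].key == key) {
            m->slot[i].used = false;
            return 0;
        }
    }
    return -1;
}
```

The fixed capacity is deliberate: like a BPF map, the toy refuses new keys once max entries is reached instead of growing.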



Here is another description from userspace to kernel flow:

Some user-customizable code is inserted into the kernel: it is compiled in userspace via LLVM, passed into the kernel (via the bpf() syscall), and then verified (for code integrity) before being executed inside the kernel.

Lots of sample code is available in the Linux kernel source:

Here is the BPF language specification (for 64bit):


And for a more comprehensive list please refer to:

How BPF has been used inside the Linux kernel is documented here:

Its use for performance measurement is demonstrated in the BCC tools:

And its use in Facebook:

Applications of BPF:
