November 9, 2023 | Mason Ferrell
Whenever I’m working to solve any kind of problem, be that a security problem, programming problem, or even a car problem, I always try to work my way to the root of the issue in a principled manner. Rather than think of problems as their own individual beasts to tackle, I try to find lines of similarity that can help me apply my knowledge as broadly as possible. This approach has not only helped me find solutions faster than I otherwise could, but I’ve also found that thinking in terms of principles helps me retain information more quickly and deeply than thinking in terms of examples. Along those lines, what follows is a set of some of the more overarching principles I try to employ when working both in the security realm (pentests, CTFs, bug bounties) and in the programming realm (especially with regard to systems/networks programming).
Principle 1: Work from the outside in
I try to follow this principle in almost everything I do, from developing, to security testing, to hobbies like playing music, martial arts, rock climbing, and working on my car. It should go without saying, but I find that people, myself included, often forget to zoom out. For example, in college I had to write a simplified version of TCP over UDP, and I continually had issues with the token values I used to signify end of transmission. I remember spending hours trying to fix my message transmission logic, when the bug was actually in a function I wrote to parse strings before transmission. Similarly, when I started doing CTFs, I would often dig down the first rabbit hole I found, such as trying to read the entire source code of a thick client, when I could have simply done a basic enumeration of a web server, found some usernames and special strings, and used those to find the relevant spots in the source code. Even outside the world of digital tech, the same principle applies: a friend’s car recently broke down, and he spent hours trying to troubleshoot the alternator and starter, when all he needed to do was tighten the positive connector.
As cliche as it is, begin by looking at the forest before inspecting the trees individually. The saying is cliched for a reason – it’s very easy to play tricks on ourselves and get caught up in the details, but often we miss a lot of information that only becomes apparent when we look at the entire picture. Zoom out.
Principle 2: Don’t be afraid to get into the details (but get there slowly)
This one seems a bit contradictory to principle 1, but oftentimes, we need to get into the weeds. While I still would urge anyone to start out at as high a level as they can, eventually you do need to work your way in. I remember being a TA for a class on network systems and having a student come to me with an issue in one of his protocols. It turned out the issue was in an external helper file he had downloaded from GitHub: due to a versioning error, one of the tools did not work as intended. The student spent hours consulting guides on using the tool, watching tutorials, and reading StackOverflow posts, but he never actually investigated the downloaded code itself. Upon doing so, he quickly found the issue (if I recall correctly, the string parser used \n instead of \r\n for line endings), and upon fixing it, his original code (which he was smart enough to back up before editing) worked perfectly.
In most situations, I’ve found the best way to find the source of an issue is to take a step back. But the best way to resolve that issue is to step back in; just make sure you step into the right spot.
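To make the line-ending bug from the TA story concrete, here is a minimal Python sketch. The function names and the sample message are made up for illustration (the student's actual helper file is not shown here); the point is only that a parser which splits on a bare \n leaves a stray \r on every line of a \r\n-delimited message.

```python
def split_lines_lf(raw: bytes) -> list[bytes]:
    """Hypothetical buggy helper: assumes lines end in a bare LF."""
    return raw.split(b"\n")


def split_lines_crlf(raw: bytes) -> list[bytes]:
    """Same helper, but splitting on the CRLF the protocol actually uses."""
    return raw.split(b"\r\n")


# A made-up two-command message that uses CRLF line endings.
message = b"USER alice\r\nQUIT\r\n"

buggy = split_lines_lf(message)
fixed = split_lines_crlf(message)

print(buggy)  # [b'USER alice\r', b'QUIT\r', b'']
print(fixed)  # [b'USER alice', b'QUIT', b'']

# The stray carriage return is why a comparison like this quietly fails.
print(buggy[1] == b"QUIT")  # False
print(fixed[1] == b"QUIT")  # True
```

Nothing crashes in the buggy version, which is exactly why this kind of issue tends to survive hours of reading tutorials and only shows up once you look at the code itself.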
Principle 3: Be aware of caches
This one gets me all the time. Caches are in damn near everything in tech these days, and they can often mess up results. Try to pay attention to any resource that might be cached; if you can clear your caches, do so. If not, keep the fact that caching is happening in the back of your mind. This one pops up so much that even when I’m 99% sure no caches are messing with any results or data I see, I still consider the possibilities a cache might introduce. They are EVERYWHERE, so keep them in mind all the time.
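As a small illustration of how a cache can hide a change, here is a sketch using Python's functools.lru_cache. The settings dictionary and get_setting function are hypothetical stand-ins for any cached lookup (DNS, a browser, a CDN, a build system); the behavior of lru_cache and cache_clear is standard library behavior.

```python
from functools import lru_cache

# Hypothetical backing store; imagine a config file or a remote service.
settings = {"timeout": 5}


@lru_cache(maxsize=None)
def get_setting(name: str) -> int:
    """Look up a setting, remembering the answer for later calls."""
    return settings[name]


first = get_setting("timeout")   # reads the store and caches 5
settings["timeout"] = 30         # the underlying value changes
stale = get_setting("timeout")   # still 5: served from the cache

get_setting.cache_clear()        # clear the cache when you can
fresh = get_setting("timeout")   # now 30: re-read from the store

print(first, stale, fresh)  # 5 5 30
```

The middle call is the trap: the code and the data are both "right", and the result is still wrong until the cache is cleared.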
Principle 4: Use automated tools when possible
If someone else already did the work, why do it again? Whenever I work with anything in codespace, I always try to find an automated tool that will save me time and energy. They don’t always work, but when they do, life gets a whole lot easier. Even when they don’t, tools can often help kickstart understanding of the underlying system by demonstrating the steps that need to be taken to use a specific API, protocol, or whatever piece of code is being used.
Principle 5: Increase verbosity on commands
How can a tool help kickstart understanding of an underlying system? One of the quickest ways I’ve found to understand how to use a system is to run an automated tool that works with that system at high verbosity (this isn’t always available, but 95% of the time I find it is). For example, when I was learning to enumerate SMB with SMBClient, I was still in the process of learning a few commands. Man pages and whatnot exist for those commands, but the proper usage isn’t always obvious. By running an automated tool like enum4linux with high verbosity, I was able to see how it issued its commands through SMBClient, and while the automated tool itself was having issues enumerating the environment, I was able to glean quite a bit of info by running SMBClient directly. More importantly, because of enum4linux’s verbosity, I figured out how to use SMBClient much quicker than I would have from documentation alone.
Principle 6: Look at the source code of automated tools
Sometimes, verbosity isn’t enough, or a tool still shows different behavior than expected, even with verbosity. Going back to principle 2, this is a great time to start digging into details. Often, automated tools contain a ton of code that is functional, and issues arise from tiny details. In these cases, we can make our own automated tools, and we can expedite this process by copying any boilerplate from already existing tools. Looking into the source code of tools can also give us a much clearer picture of how things work under the hood with a new protocol. I once had to write an HTTP server using only TCP in C early in my college career, and I barely understood HTTP. By reading source code for some open-source Python HTTP servers, I was able to understand what all had to happen to make an HTTP server work, and while there was a lot more work in rewriting this server in C, I saved probably 20-30 hours of work I would have otherwise had to carry out.
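For a sense of what reading those Python servers revealed, here is a rough sketch of the bare minimum an HTTP server has to do on top of a TCP socket. This is not the code I read or wrote back then; build_response and serve_once are names invented for this example, and it handles only a single simple GET request.

```python
import socket


def build_response(request: bytes) -> bytes:
    """Turn raw request bytes into raw response bytes (illustrative only)."""
    # The request line is everything before the first CRLF: "GET /path HTTP/1.1".
    request_line = request.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = request_line.split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        status, body = "400 Bad Request", b"bad request\n"
    elif parts[0] != "GET":
        status, body = "501 Not Implemented", b"only GET is supported\n"
    else:
        status, body = "200 OK", f"you asked for {parts[1]}\n".encode()
    # Status line, headers, a blank line, then the body.
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def serve_once(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Accept one TCP connection, answer one request, then exit."""
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn:
            data = b""
            # Read until the blank line that ends the request headers.
            while b"\r\n\r\n" not in data:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                data += chunk
            conn.sendall(build_response(data))


sample = build_response(b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n")
print(sample.decode("ascii"))
```

Calling serve_once() and pointing a browser or curl at http://127.0.0.1:8080/ (the port is an arbitrary choice) exercises the socket half. Keeping the parsing in a separate pure function makes it easy to see that the protocol is mostly carefully formatted text, which is the part that ports over to C.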
Principle 7: Fuzz everything
Whether testing a network protocol for robustness, a website for security, or a database for leakages, we can often find out a lot of information quickly by using a fuzzer to automate the process of testing different inputs against different endpoints. While our final goal is to understand what can cause errors at a system level and make sure we have a robust system, system-level details can get very confusing, and many things have to be taken into account to understand all the nuances. Fuzzing won’t exactly give us a deeper understanding of a system, but with a well-chosen wordlist, fuzzing can show us where our system breaks down much quicker than a source code review. We can then take the inputs that cause unexpected behavior and use them to determine what’s wrong with our system; this helps point us in the right direction and avoid a lot of unnecessary fumbling.
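Here is a toy version of that loop in Python, just to show the shape of it. Everything in it is hypothetical: parse_record is a deliberately buggy stand-in target, the wordlist is three made-up seed inputs, and the only mutation is truncation. Real fuzzers mutate far more aggressively, but the workflow is the same: feed inputs, ignore the errors you expect, and keep the inputs that trigger something you did not expect.

```python
def parse_record(data: bytes) -> bytes:
    """Hypothetical target: the first byte is the payload length."""
    length = data[0]  # bug: nothing checks that data is non-empty
    payload = data[1:1 + length]
    if len(payload) != length:
        # An expected, deliberate error for truncated records.
        raise ValueError("truncated record")
    return payload


def mutations(seed_input: bytes):
    """Yield every prefix of the seed, from empty up to the full input."""
    for cut in range(len(seed_input) + 1):
        yield seed_input[:cut]


def fuzz(target, wordlist):
    """Run target on each mutated input; collect unexpected exceptions."""
    crashes = set()
    for seed_input in wordlist:
        for candidate in mutations(seed_input):
            try:
                target(candidate)
            except ValueError:
                pass  # the target rejected the input on purpose
            except Exception as exc:
                crashes.add((candidate, type(exc).__name__))
    return crashes


wordlist = [b"\x03abc", b"\x00", b"\x05hello"]
crashes = fuzz(parse_record, wordlist)
print(crashes)  # {(b'', 'IndexError')}
```

The single crashing input (an empty message) points straight at the unchecked data[0], without anyone having to read the parser first.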
Principle 8: Be aware of the big-endian little-endian issue
This is one of those issues that happens rarely enough that a decent number of programmers and security testers I know forget about it, but it happens often enough, and usually causes enough headaches, that I try to always keep it written down somewhere. Most programmers probably remember taking some systems course and losing a whole weekend to some endian-related issue, but it pops up in all sorts of other fields in tech. I’ve seen nameservice records that represent domains in both “big-endian” and “little-endian” format (e.g. google.com might be represented in one server as “google,com” and another as “com,google”), I’ve seen the issue pop up in custom filesystems, and I’ve seen the issue happen in neural network data models. While not an everyday bug, it pops up more often than many others, and it has been the source of many, many hours of banging my head into the wall.
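A quick Python sketch of what the byte-order problem looks like in practice, using the standard struct module. The value 0x12345678 and the variable names are arbitrary choices for the example, and the last two lines are only a loose analogy for the domain-label ordering mentioned above.

```python
import struct

value = 0x12345678

# The same 32-bit integer, serialized in each byte order.
big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading big-endian bytes as if they were little-endian gives a
# perfectly valid, completely wrong number rather than an error.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x78563412

# The same idea at the level of labels rather than bytes.
labels = "google.com".split(".")
print(".".join(reversed(labels)))  # com.google
```

The misread value is the nasty part: nothing fails, so the bug usually surfaces far away from the line that caused it.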
What next?
I’ve tried to give a set of pretty overarching principles here that can apply to most IT professionals, students, and hobbyists. I’d say in my own work, these principles have helped me solve 90% of the issues I’ve encountered. However, these principles are definitely on the simpler side of things, and the real value comes from being able to handle the remaining 10% that they might not cover in whatever field is in question. So what now?
I would encourage any readers to develop their own principles for their own areas of work and interest as they learn, and see how those principles might come in handy later down the road. I try to keep track of important overarching concepts in all my areas of interest, organized by field. For example, I keep a list of more fine-grained principles for attacking network environments; the first principle on that list was noted after I learned about ARP spoofing, and it most recently helped me carry out an LLMNR poisoning attack, despite my unfamiliarity with LLMNR. I similarly keep separate lists for web-related concepts and mobile-related concepts. As you, the reader, work through problems in your own life, see what fundamental concepts you notice appearing regularly, and see how noting such concepts can come back to bless you further down the road.