Sunday, May 19, 2013

My Take on the City of Akron Hack

On Thursday, May 16, 2013, a Turkish hacking group called Turkish Ajan hacked into the City of Akron and released a number of files that contain personal information on a number of Akron citizens. According to the city, the attackers were able to gain access to some internal systems where they obtained tax information.

The news has died down on this for the moment, but from the information that has been released, there are some things we can infer:

1. The attackers compromised the city's public website. From the errors that were being displayed on the site, information that has been released from the city, and the way this group works, it was likely through SQL injection (although this has not been specifically stated yet).

2. The attackers compromised the city's internal systems and obtained access to tax systems. It is unknown if they were able to do so from the city's public website, through the tax paying system, or some other server. In any case, this appears to be where the attackers got the files they posted.

3. Around 25K people are affected.

4. The FBI is involved and was called in quickly after the compromise was discovered. IMO, this is good.

Any additional information on what happened is pretty much speculation. Trust me, I've been speculating a lot and have a pretty good idea of what happened, but I have no proof. Hopefully whoever is doing the forensics for the city will have their findings released at some point. As an Akron citizen and taxpayer, I want to know this information.

However, there is one thing that I think needs to be brought up. Why was this information stored unencrypted? If it was encrypted, how did the attackers obtain access to the keys to decrypt it?

The information that was released contains social security numbers of both the taxpayer and their spouse, and credit card numbers. According to PCI standards (and my understanding), the credit card numbers should have been encrypted. The federal government is required to comply with PCI, what about the city of Akron government?

As for the SSNs, I don't know of any specific regulation that requires that information to be encrypted (please let me know if there is one), but I can't imagine that there is any reason it shouldn't be. I have a feeling there are at least 25,000 people who agree with me.

One final item of note. The press has been getting quotes from Deputy Mayor Rick Merolla. With all due respect sir, shut up. I can only imagine your IT and information security people are cringing whenever they read your quotes pertaining to the security of the city of Akron systems.

I'm sure you are very smart, but it's obvious you are not familiar with information security. Quotes such as "Our systems are all, all our virus protection, intrusion protection systems, all of our virus software is still up to date so we are still not sure how they got in" show this. Let those performing the investigation or the talented IT personnel you employ speak on these things.

If you like, I am personally offering to give you a training course on information security, hackers, and how attacks take place. This will at least give you an idea on why the things you have been quoted as saying are cringe-worthy.

Friday, April 19, 2013

MASTIFF 0.6.0 Released!

The latest version of MASTIFF, 0.6.0, has just been released! Run over to the download site and grab the latest version!

The official changelog is located here, but the major improvements are described below.

Upgrading MASTIFF to the latest version is easy. You can follow this process:
  1. Download and install pydeep.
  2. Download MASTIFF 0.6.0 and untar it.
  3. Run "make test" to ensure you are not missing any dependencies.
  4. Run "sudo make install" to install the latest version.
  5. Copy the analysis plug-ins (the plugins directory in the tarball) to your location of choice and ensure the config file is pointing to that directory.
  6. Add any new options to your MASTIFF config file. The easiest way may be to use sdiff.

Queue

MASTIFF now has a queueing system so multiple files can be analyzed by the framework. To utilize this, give MASTIFF a directory instead of a file to analyze. It will find all files in that directory and its subdirectories, add them to the queue, and begin processing.

The queue is maintained within the MASTIFF database. So, if you have to stop MASTIFF in the middle of its run, it will resume processing the queue when it's restarted. Some additional options have been added to allow you to work with the queue:
  • --clear-queue: This will clear the current queue.
  • --ignore-queue: This will ignore the queue and just process the file you give it.
Analysis plug-ins are also taking advantage of the queue. The pdf-parser and ZipExtract plug-ins have a new option ("feedback") which allows you to feed files from the plug-ins back into the queue for processing. For example, the ZipExtract plug-in will add all files that were extracted from the archive into the queue for processing.
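
To make the idea concrete, here is a minimal sketch of a database-backed queue that survives an interrupted run. This is not MASTIFF's actual code: the table layout and the function names (open_queue, enqueue_path, process_queue, clear_queue) are hypothetical and only illustrate the behavior described above.

```python
import os
import sqlite3


def open_queue(db_path):
    """Open (or create) a SQLite database holding a hypothetical queue table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT UNIQUE)"
    )
    conn.commit()
    return conn


def enqueue_path(conn, target):
    """Queue a single file, or every file under a directory and its subdirectories."""
    if os.path.isdir(target):
        for root, _dirs, files in os.walk(target):
            for name in sorted(files):
                conn.execute(
                    "INSERT OR IGNORE INTO queue (path) VALUES (?)",
                    (os.path.join(root, name),),
                )
    else:
        conn.execute("INSERT OR IGNORE INTO queue (path) VALUES (?)", (target,))
    conn.commit()


def process_queue(conn, analyze):
    """Analyze queued files in order, removing each entry only after it finishes."""
    while True:
        row = conn.execute(
            "SELECT id, path FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            break
        analyze(row[1])
        conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        conn.commit()


def clear_queue(conn):
    """Drop all pending entries (the analogue of a --clear-queue option)."""
    conn.execute("DELETE FROM queue")
    conn.commit()
```

Because an entry is deleted only after its analysis returns, anything still in the table after a crash gets picked up on the next run. In this sketch, a "feedback" style plug-in would simply call enqueue_path() on each file it extracts.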

Fuzzy Hashing

Fuzzy hashing is not something new within MASTIFF. However, we have changed the Python library used for it. Previously, we used pyssdeep but found that there were a number of stability issues with it on OSX and when processing large numbers of files.

Therefore, we have switched to pydeep (https://github.com/kbandla/pydeep). Our testing has shown it to be much more stable thus far.

libmagic

There was some confusion on which Python libmagic libraries to use when installing MASTIFF. To help alleviate some of that, the framework has been modified to use two different libmagic libraries:
If either library is installed, MASTIFF will utilize them.

Other Changes

A number of other bug fixes and improvements have been made. Please see the changelog file for a complete list.

As always, if you have any questions, please email mastiff-project@korelogic.com.

We have a lot of great things coming down the pipe for MASTIFF, but if you have any suggestions, enhancements or plug-ins, let us know!

Thursday, February 21, 2013

MASTIFF: Automated Static Analysis Framework

Malware analysis is a process that begs to be automated. Messing up one step or running one tool incorrectly can cause you to have to restart the entire process. Fortunately, there are a number of automation frameworks or systems, such as Cuckoo or Threat Expert, that exist to help automate malware analysis.

While these automation frameworks are great, they tend to focus on dynamic analysis (behavioral analysis); static analysis (characteristic analysis) is mostly left out. The static analysis techniques that the frameworks do perform vary, but typically include hashing, strings extraction, some file-type specific tools, along with a couple other techniques. Additional static analysis programs or techniques usually have to be implemented on their own.

To do this, analysts typically create a master static analysis script that runs all of the tools desired against a file. However, if an analysis tool is run against a file type that it cannot analyze, such as a PE header analysis tool on a PDF, you run the risk of crashing the analysis program and, in turn, your automation script.

As an incident responder and malware analyst, I came up against these issues all the time, so I started to look for a solution. Nothing existed to automate the entire static analysis process and allow you to add in your own techniques.

That is why MASTIFF, an open source automated static analysis framework, was created. MASTIFF performs two functions for the analyst:
  • The file type of the file being analyzed is automatically determined.
  • Only those techniques which work on that file type are applied.
By automatically determining the file type for the analyst and ensuring that only the static analysis techniques that work on that file type are run, analysts can be assured that the risk of crashing the automated process is lessened, and that only relevant data is returned.
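
A stripped-down sketch of that "detect the type, then run only what applies" flow might look like the following. It is an illustration only, not MASTIFF's implementation: the signature table, the two toy techniques, and every name in it are hypothetical, and a real framework would lean on something like libmagic rather than a hand-rolled list of magic bytes.

```python
# Hypothetical magic-byte signatures; a real framework would use libmagic or similar.
SIGNATURES = [
    (b"MZ", "PE"),
    (b"%PDF", "PDF"),
    (b"PK\x03\x04", "ZIP"),
]


def detect_type(data):
    """Return a coarse file type based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def string_count(data):
    """Generic technique: count runs of 4 or more printable ASCII characters."""
    count, run = 0, 0
    for byte in data:
        if 32 <= byte < 127:
            run += 1
        else:
            count += run >= 4
            run = 0
    return count + (run >= 4)


def pe_header_offset(data):
    """PE-only technique: read e_lfanew, the file offset of the PE header."""
    return int.from_bytes(data[0x3C:0x40], "little")


# Each technique declares which file types it applies to ("*" means any type).
TECHNIQUES = [
    ("strings", {"*"}, string_count),
    ("pe_header", {"PE"}, pe_header_offset),
]


def analyze(data):
    """Run only the techniques registered for the detected file type."""
    file_type = detect_type(data)
    results = {"type": file_type}
    for name, types, func in TECHNIQUES:
        if "*" in types or file_type in types:
            results[name] = func(data)
    return results
```

Handing this a PDF never invokes pe_header_offset(), which is the whole point: the PE-specific code cannot choke on input it was never meant to see.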


MASTIFF works by utilizing plug-ins for both file-type detection and static analysis techniques. The decision to utilize plug-ins was two-fold:
  • The types of files analyzed and the techniques available within MASTIFF can be easily expanded by adding new plug-ins.
  • MASTIFF is able to be "crowd-sourced".
The last reason was especially important. Anyone can create a new plug-in to add a new file type or analysis technique. As more people add plug-ins, the more useful the framework becomes. To facilitate easier plug-in development, template, or skeleton, plug-ins have been included with the project. In just a few minutes, someone can modify a few fields in the template and have a new plug-in ready to go.

In the coming weeks, I'll be posting information and tutorials related to MASTIFF, how to use it, how to create plug-ins for it, etc. Please let me know any questions you have on the framework or if there is something specific that should be focused on.

Finally, I want to state that MASTIFF was funded through KoreLogic, the company I work for, and the DARPA Cyber Fast Track (CFT) program. If you are unfamiliar with CFT, I highly recommend looking at their site and submitting a proposal. It's a great program, but you only have until April 1, 2013 to do so and then no further submissions will be taken.

Tuesday, February 19, 2013

ShmooCon 2013

This past weekend I went to my first ShmooCon in Washington, D.C. I have to say this was an experience that I was not expecting. I've been to many security conferences in the past, including RECon, BlackHat, GFIRST, and some SANS and OWASP conferences. ShmooCon ranks in the top two, if not as the best, of those I've been to.

The best thing about ShmooCon is that it has a small con feel to it, while having everything the big cons have (e.g. big name speakers, contests, prizes, lots of smart people). It also has a small con price - if you can get a ticket, it's only going to cost you around $150.

I was also lucky enough to be selected as a speaker this year, presenting a talk on my newly open-sourced tool MASTIFF. From a speaker's perspective, they have one of the best-run CFP processes I have ever used. After selection, they are constantly available for questions, have excellent moderators, and are great at making sure you have what you need.

The talks at the conference were amazing. They are of the highest quality and even the ones I didn't like were full of good information. Since I was releasing MASTIFF the first day I was there, and I was freaking out about my talk (I was in the last speaking slot of the tracks), I didn't get to see all that I would have liked. However, these stood out:

  • NSM and more with Bro Network Monitor by Liam Randall - This was the best talk of the conference IMO. Liam gave an excellent talk about what Bro is, how it works, and even how easy it is to extend it. His presentation was how all presentations should be - easy to follow and good at explaining a relatively complicated concept.
  • Crypto: You're doing it wrong by Ron Bowes -  Ron gave an excellent talk about some crypto attacks, how they can be performed, and even did 3 live demos (that didn't fail) that performed these attacks. I'm not a crypto guy, but Ron's explanations of everything were easy to follow and entertaining. Plus he used The Call of Cthulhu as some of his encrypted text.
There were a lot more that I saw that were excellent, and some that I unfortunately missed. Luckily, ShmooCon makes all their recordings available online for free and they should be up in a couple of weeks. I look forward to next year!

Friday, November 9, 2012

2008 Malware Challenge


In 2008, Greg Feezel and I published the following malware analysis challenge. The goal was to answer the questions below and submit them back to us for prizes. While the challenge is no longer going on, we wanted to publish it again so those that wished to try it could.

The malware is contained within a password protected zip file named malware.zip. The password is “infected”. The MD5 hashes of the files are:
  • 59a95f668e1bd00f30fe8c99af675691 malware.exe
  • 31d2ec3b312d0fd27940aae5c89e3787 malware.zip
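
If you want to confirm you have the right files before starting, a few lines of Python will do it. This is just a convenience sketch using the standard hashlib module; the function names are my own, and the hashes are the ones listed above.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hashes as published in this post.
EXPECTED = {
    "malware.exe": "59a95f668e1bd00f30fe8c99af675691",
    "malware.zip": "31d2ec3b312d0fd27940aae5c89e3787",
}


def verify(path, name):
    """Return True if the file at path matches the published hash for name."""
    return md5_of_file(path) == EXPECTED[name]
```

For example, verify("malware.zip", "malware.zip") should return True for an intact download sitting in the current directory. Only unpack and handle the sample inside an isolated analysis lab.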

Situation:

A system administrator within your organization has come to you because a user's PC was infected with malware. Unfortunately, anti-virus is unable to remove the malware. However, the administrator was able to recover the suspected malware executable. Your job is to analyze the malware.
Participants should download the malware sample and analyze it. The end result should be a document containing details on the analysis performed. The analysis document can be written in any form, but the following questions and statements should be answered within it. Participants should note when questions are being answered.
  • Describe your malware lab.
  • What information can you gather about the malware without executing it?
  • Is the malware packed? If so, how did you determine what it was?
  • Describe the malware's behavior. In other words - what files does it drop, what registry keys does it modify, what network connections does it create, how does it auto-start, etc?
  • What type of command and control server does the malware use? Describe the server and interface this malware uses.
  • What commands are present within the malware and what do they do? If possible, take control of the malware and run some of these commands, documenting how you did it.
  • How would you classify this malware? Why?
  • What do you think the purpose of this malware is?
Bonus questions:
  • Is it possible to find the malware's source code? If so, how did you do it?
  • How would you write a custom detection and removal tool to determine if the malware is present on the system and remove it?

Blog Post Down

Yesterday I published a post on the 2008 malware challenge that I helped put together and how I felt it was being mis-represented in another security company's (pay for) CTF.

The person responsible for that CTF posted a comment on the blog and asked me to contact him, stating it was really a mistake and no ill-intent was involved. I believe him.

The security industry we work in is very small. If your integrity is besmirched* then that can have negative effects on your career or company. I would not want to be responsible for that in the case of a simple oversight.

That is why I removed the blog post. In all fairness, I should have contacted them first before posting anything.

I am still posting the malware challenge and will do so later today.



* Woohoo! I got to use besmirched in a blog post!

Friday, November 2, 2012

NEOISF Puzzle Solution

A few people emailed me with the solution to the puzzle I posted, but I figured I'd post the solution for those that wanted it.

In the puzzle, Van Helsing is attempting to break the crypto that Dracula is using to try and find him. Fortunately for Van Helsing, the program is free and he can download it to see if he can crack it. He ran the program and typed in "vampire_vampire_vampire" and got back "R1lUR1hKXGhHWVRHWEpcaEdZVEdYSlw=". 

Anyone who has done any type of network analysis, or looked at a raw SMTP message, should recognize the output as base64 encoded. Base64 is an algorithm that converts binary data to ASCII so it can be transferred over protocols that do not natively allow binary (e.g. SMTP). It does this by converting every 3 bytes of data to 4 bytes of ASCII. The "=" character is used as padding in case more characters are needed and is often a give-away.
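
The 3-to-4 conversion and the "=" padding are easy to see with Python's standard base64 module. The helper name below is just for illustration.

```python
import base64


def b64(raw):
    """Encode bytes as base64 and return the result as text."""
    return base64.b64encode(raw).decode("ascii")


# 3 input bytes fill a 4-character group exactly; shorter tails are padded with "=".
for raw in (b"abc", b"ab", b"a"):
    print(len(raw), "byte(s) ->", b64(raw))
```

Three bytes encode to "YWJj" with no padding, two bytes to "YWI=", and one byte to "YQ==", so the output length is always a multiple of four.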

Base64 can be converted using many methods, but since Van Helsing is awesome he is using Linux and uses the base64 command to do so.

$ echo -n R1lUR1hKXGhHWVRHWEpcaEdZVEdYSlw= | base64 -d -
GYTGXJ\hGYTGXJ\hGYTGXJ\

NOTE: Van Helsing really should have redirected the output to a file since the characters could have been binary.

The base64 decoding produced a string that has 2 interesting qualities.

First, the base64 decoded string is the same length as the string he entered. This means that whatever algorithm the encryption program is using may be doing a 1-for-1 character encryption. In other words, the characters in his plaintext are being encrypted one at a time.

Second, there is a pattern of "GYTGXJ\h". The pattern is 8 characters long, which just happens to be the length of "vampire_". Coincidence? Probably not. 
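
Both observations can be checked mechanically rather than by eye. The sketch below decodes the output in Python, which also sidesteps the binary-output concern since the result is printed as a bytes literal. The smallest_period helper is my own and is not part of the puzzle.

```python
import base64

PLAINTEXT = b"vampire_vampire_vampire"
ENCODED = "R1lUR1hKXGhHWVRHWEpcaEdZVEdYSlw="


def smallest_period(data):
    """Return the shortest p such that data[i] == data[i + p] for every valid i."""
    for period in range(1, len(data)):
        if all(data[i] == data[i + period] for i in range(len(data) - period)):
            return period
    return len(data)


ciphertext = base64.b64decode(ENCODED)
print(ciphertext)                          # raw bytes, shown safely as a bytes literal
print(len(ciphertext) == len(PLAINTEXT))   # is it the same length as the input?
print(smallest_period(ciphertext))         # length of the repeating block
```

The lengths match (23 bytes each) and the ciphertext repeats every 8 bytes, the same period as the plaintext.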

The type of encryption that immediately popped into Van Helsing's head that can have these properties is XOR encryption. XOR is a boolean logic function that can be applied in encryption. This is done by taking a key and XOR'ing each of its bytes against the characters in the plaintext. 

One property of XOR encryption is that if you take the plaintext and XOR it with the ciphertext, it will reveal the key! Van Helsing knew this and XOR'd his plaintext against the ciphertext he got. (He wrote a quick Python script to do so):

$ python xordecode.py 'GYTGXJ\hGYTGXJ\hGYTGXJ\' vampire_vampire_vampire

18971897189718971897189

Voila! XOR'ing each byte of his plaintext with the ciphertext he received returned a pattern of "1897", which must be the key!
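
The reason this works: if each ciphertext byte is C = P xor K, then C xor P = K, because XOR'ing a value with itself cancels out. The script Van Helsing used is not reproduced here, so the following is only a sketch of what such a script might look like, with xor_bytes being a name I made up.

```python
import base64
from itertools import cycle


def xor_bytes(data, key):
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


plaintext = b"vampire_vampire_vampire"
ciphertext = base64.b64decode("R1lUR1hKXGhHWVRHWEpcaEdZVEdYSlw=")

# XOR'ing known plaintext with its ciphertext exposes the repeating keystream.
keystream = xor_bytes(ciphertext, plaintext)
print(keystream.decode("ascii"))
```

Running it prints the same 1897 pattern shown above, and XOR'ing the plaintext with just "1897" reproduces the ciphertext, which confirms the key is four characters long.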

Taking that as the key, he then base64 decoded Dracula's message and applied the key of 1897 to get:

I will be at the Ohio Information Security Summit.
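
Since XOR is its own inverse, decryption is the same operation as encryption. Dracula's original ciphertext is not reproduced in this post, so the sketch below encrypts the known answer as a stand-in and decrypts it again. It assumes the program XORs with the repeating key and then base64 encodes, which is what the test input suggests. The function names are hypothetical.

```python
import base64
from itertools import cycle

KEY = b"1897"


def xor_bytes(data, key):
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def encrypt(message, key=KEY):
    """XOR the message with the repeating key, then base64 encode the result."""
    return base64.b64encode(xor_bytes(message.encode("ascii"), key)).decode("ascii")


def decrypt(encoded, key=KEY):
    """Base64 decode, then XOR with the same repeating key to recover the text."""
    return xor_bytes(base64.b64decode(encoded), key).decode("ascii")


# Stand-in ciphertext: we encrypt the known answer ourselves, then decrypt it.
stand_in = encrypt("I will be at the Ohio Information Security Summit.")
print(decrypt(stand_in))
```

As a sanity check, encrypt("vampire_vampire_vampire") gives back the same base64 string Van Helsing saw when he ran the program.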

Now Van Helsing knew where he would be and could destroy the fiend!

For those in the know, the key does have some significance. :)