Meltdown: Reading Kernel Memory from User Space

Link: https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-lipp.pdf

Summary:

Modern computer systems depend on kernel memory being inaccessible from user space. This paper, written by a multitude of authors and teams, turns that assumption on its head, proving that their attack, Meltdown, can exploit side effects of out-of-order execution in order to read private data. The prevalence of out-of-order execution in modern systems makes this paper even more relevant, given the vulnerability exists in almost every computer in the world. Thankfully, the paper also explores mitigation techniques that were developed for other reasons, such as the KAISER defense, which inadvertently can defend against these types of exploits to some degree.
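The core trick can be sketched as a covert channel: a transient out-of-order access encodes a secret byte into which cache line gets touched, and Flush+Reload recovers it by timing accesses. The snippet below is only a toy simulation of that logic (a Python set stands in for the cache, and all the names are mine), not a working attack:

```python
# Toy simulation of the Flush+Reload covert channel at the heart of Meltdown.
# A real attack transiently reads a kernel byte and touches one of 256 probe
# pages indexed by its value; the receiver then times access to each page.
# Here the "cache" is just a set, so this illustrates the logic, not timing.

CACHE_LINE = 4096  # one probe page per possible byte value

def transmit(secret_byte, cache):
    # transient access: probe_array[secret_byte * CACHE_LINE] gets cached
    cache.add(secret_byte * CACHE_LINE)

def receive(cache):
    # Flush+Reload: a "fast" (cached) access reveals which line was touched
    for value in range(256):
        if value * CACHE_LINE in cache:
            return value
    return None

cache = set()          # flush: start with an empty cache
transmit(ord('K'), cache)
recovered = receive(cache)
print(chr(recovered))  # -> K
```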

What I liked:

  1. The paper details Meltdown, a vulnerability that doesn't rely on a flaw in software

  2. The paper looks into mitigation techniques such as the KAISER defense mechanism for KASLR

  3. The paper presents a really interesting end-to-end attack which looks at the different facets of how an attack would really happen out in the wild.

  4. The paper goes into not only the raw attack but also ways to optimize it

  5. The explanation of why KAISER defends against aspects of Meltdown was a very interesting addition

 

What I didn't like:

  1. This attack is very specific to out-of-order execution processors - which is starting to become a common attack surface, as we saw in the last paper

  2. This attack doesn't work on all Windows machines, only a subset of them

  3. The mitigation techniques aren't novel, i.e., we've already deployed them for a different reason

  4. The paper doesn't explain at all why this attack doesn't work on iOS - does it have to do with the fact that Apple builds its chips differently?

  5. When it comes to asking questions, a lot of people and teams worked on this paper - so it might be a challenge finding the right person to get in contact with

 

Points for Discussion:

  1. How did the discovery of attacks that break KASLR differ from the discovery of Meltdown?

  2. Why is out-of-order execution prevalent in modern cores, from an architecture level, as opposed to older cores?

  3. What have virtualization providers done in the aftermath of the Meltdown disclosure in order to secure their services?

  4. Why is there a difference in the vulnerability when it is run on Linux as opposed to Windows?

  5. Have there been any documented Meltdown attacks on Android?

 

New Ideas:

  1. Compare the architecture of Apple chips to Intel's, specifically in the context of attacks such as KASLR bypasses and Meltdown - why didn't Apple fall into the same pitfall as Intel?

  2. Is there an alternative way to segment enclave memory to prevent these attacks?

  3. Map other CPUs that might share similar designs and see if they fall to similar attacks

  4. Study the prevalence of Meltdown attacks on Android and compare against systems that had timely patches

  5. Is there a way to attack the supervisor bit on the processor to get access to restricted areas?

2nd Place: Boeing Design Competition

From left to right: Jonny, Aaron, Joel, and Laurence

I had the opportunity to spend the night with some of my friends working on a design challenge presented by Boeing. At the end of the night we took second place out of the dozens of teams that presented.

As part of the challenge, my team had to manage the different complexities, from a cost, viability, and design standpoint, of putting a communications drone into the sky. We then had to present our case to an audience of a couple hundred people.

The first step for our team was to understand the problem. See the information we were given in the opening presentation here.

Once we were given the problem, we needed to put it into a form where we could analyze different solutions. Check out the spreadsheet our team built to figure out the best solution here. (Like, actually, it's a really amazing spreadsheet.)

Once we understood and could quantifiably define our problem, our team divided and conquered our objectives by playing to our respective strengths. For example, Laurence, an astronautical engineer, knew the ideal shape of our structure to make it as aerodynamic as possible. Aaron was very knowledgeable about composite materials as a Chemistry and Applied Math major, so he worked on making sure the drone met the weight specifications using cutting-edge composites. And with my experience in electrical engineering, I worked on some of the communications payload with my friend Jonathan, another CS student.

Now even though I’m not an electrical engineer, some of my hobby reading and past projects exposed me to error detection and correction coding, along with compression. It was the extra components I thought to add to our communications payload that made our system so unique and placed us on the podium.

The way we sold it to the judges - Boeing employees who had worked on a similar project - was that we built a design that exceeded all the project specifications (structural, propulsion, etc.) and proposed a software update incorporating compression coding that would increase the data throughput of the communications payload. This would allow our client, the federal government, to get a lot more use out of the communications drone for a little bit higher of a price. And this update would come at virtually no cost to Boeing as a company, since it could be deployed fairly easily.

Now while we didn’t win first (the team that did were all aerospace engineers), what we did learn was how to work together in a multidisciplinary team. Almost none of our team members had any skills overlap - which meant each of us had to take full ownership of the piece of the project we were assigned. Moreover, the experience gave me a really in-depth crash course in what it is to be a Product Manager, working with different teams at a highly technical level to manage many different facets of a project.

Overall I loved the experience and thought it was a really interesting problem to solve!

Check out our final Presentation here!

EnclaveDB: A Secure Database using SGX

Summary: EnclaveDB is a database engine that guarantees confidentiality, integrity, and freshness for data and queries even when every other actor, including the database administrator, is malicious. The novel way it achieves this is through a small trusted computing base that includes in-memory storage and precompiled procedures. I think the real beauty of this technology is that they were able to do it with minimal synchronization between threads - which, honestly, I would find a challenge to do.

 

What I liked:

  1. The attacker model for this paper is amazing; the authors found a way to maintain the integrity of the database even when the database admin is malicious

  2. The paper explains really well why current methods for property preserving encryption end up failing or not holding up as well as this solution

  3. The system maintains the programming model similar to conventional relational databases so they are not reinventing the wheel.

  4. I found it interesting that the system allows for remote attestation - which seems very similar to the literature I've read on distributed ledgers and blockchain systems

  5. The protocol requires minimal synchronization between threads - which explains why the system is able to maintain freshness of data

 

What I didn't like:

  1. Even though the security leaps offered by this system are huge, I think that 40% more overhead might be too much to supplant traditional database systems.

  2. The system requires specific hardware (SGX) which means that systems need to replace their current hardware in order to use this system

  3. I'm not sure how practical it is to assume that we can host huge amounts of data in DRAM just because DRAM prices are falling

  4. The system uses the Intel attestation verification service to check if a given quote has been signed by a valid attestation key. From what I understand, if this service fails it could end up compromising the entire system, so it's possibly a single point of failure

  5. I'm not really 100% sure how the remote challenger can establish trust in an enclave without a lot of really intensive ZKPs

 

Points for Discussion:

  1. How different are these secure enclaves from the ones used by Apple?

  2. How could someone spoof an enclave in an untrusted database?

  3. Instead of carving out a secure area in an untrusted database, why not work on building a database that is secure end to end?

  4. How is the Merkle tree used in EnclaveDB different from the one used in blockchains?

  5. What percentage overhead for a secure database system like this is acceptable for wide scale adoption in the enterprise community?
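On the Merkle tree point above: a minimal sketch of how a Merkle root lets a verifier detect tampering with any row of data. This version duplicates the last node on odd-sized levels (the Bitcoin convention); EnclaveDB's actual tree layout may well differ, so treat this as a generic illustration:

```python
import hashlib

def merkle_root(leaves):
    """Compute a Merkle root over a list of byte strings.

    Minimal sketch: hashes each leaf, then repeatedly hashes pairs until one
    root remains. Any change to any leaf changes the root, which is how a
    small trusted root can authenticate a large untrusted store.
    """
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:              # odd level: duplicate the last node
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"row1", b"row2", b"row3"])
tampered = merkle_root([b"row1", b"rowX", b"row3"])
print(root != tampered)  # -> True
```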

 

New Ideas:

  1. Explore new ways to audit results from this database

  2. Building out distributed attestation might be really interesting, because each node in a distributed system could verify in a different way whether the enclave really is an enclave. It could also distribute the computation of a ZKP

  3. From a hardware POV is there a way to designate secure memory only for a specific function?

  4. How can distributed methods of memory tables be used in systems like blockchains?

  5. Analyze the adoption of systems like these and their user groups

 

Iron: Functional Encryption using Intel SGX

Summary: This paper covers Iron, a really powerful functional encryption system. The reason the paper is really groundbreaking is that it enables a faster, more practical version of functional encryption - a landslide faster than current alternatives. The problem that makes the paper hard to analyze is that it's built on SGX, which is proprietary, and as a result security researchers don't have the same access to scrutinize it compared to, say, open source software. The reason this technology is cool is that it could pave the way for more sharing of data while at the same time protecting that data from misuse. If someone wants to keep banking information in the cloud or with some entity, they can create keys that share only the data they agree to share with another entity, improving overall security.

 

What I liked:

  1. Functional Encryption is a technology that has a lot of potential applications if the technical and feasibility aspects of it can be worked out

  2. This method runs functional encryption at full processor speeds, which has been a challenge to accomplish in past studies

  3. The method works well on complex functions and might even fare relatively better the more complex the function is

  4. The system doesn't put all its hope in the trust offered by SGX or treat it as a black box - the study acknowledges that SGX has limitations

  5. The study has a very good explanation of the relevant SGX knowledge needed to understand this paper.

 

What I didn't like:

  1. There might be a single point of failure in the attestation system which secures enclaves

  2. There are ways to spoof the request for keys that the paper doesn't cover

  3. I'm not sure how realistic it is for the enclave to erase everything relevant to its state from memory

  4. This system requires three different secure enclaves - is it possible to do it with two or fewer?

  5. SGX isn't open source so it might be hard to evaluate it

Points for Discussion:

  1. Is there room, from a hardware POV, for future improvements to functional encryption?

  2. Say 20 years down the line the encryption underlying functional encryption is broken - are there any precautions that should be taken when encoding the core data?

  3. Has there ever been a documented failure of Intel's attestation verification service?

  4. How comprehensive can a security study of SGX products be, given that the core tech is proprietary?

  5. Are there any other comparable technologies with different implementations that could serve as the basis for future functional encryption?

New Ideas:

  1. Possible audit methods for making sure the technology gives you the right answer using rudimentary systems, maybe CSD?

  2. Explore side channel attacks which might not have been covered in the paper

  3. Develop Iron for an open source environment such as Sanctum

  4. Encrypt data even more before putting it into functional encryption

  5. What regulations need to be in place for the wide scale adoption of functional encryption

SCheduler Part 1: Web Scraper

The SCheduler Team (from left to right : Justin, Jillian, Luis, Joel, and Jincheng)

This semester I am working on a web application to help USC students schedule their classes.

My job right now is to figure out a way to get course information from the USC website. The way I went about that was using Python, specifically the Beautiful Soup library, to scrape data from the USC website into our database (Firebase).

Now, some challenges I ran into: USC’s website is almost entirely rendered in JavaScript - i.e., in order to get the course times and section numbers, someone has to click on the class on the website. I found a really janky workaround, which essentially meant triggering all the JavaScript responses on the page before scraping it.

Another challenge I ran into was actually cleaning the data and standardizing it so it could be sent to the database. The course sections I scraped numbered in the high thousands - so of course there were going to be variations in the way courses were displayed (i.e., Wednesday vs. Wed vs. W, professor names being hyperlinks to webpages, or stray commas that would mess up the delimiter as I parsed the data). What made this a challenging project was that one small problem I couldn’t see could mess up how the data got read and sent to the database. That meant that on top of random spot checks, I had to devise a way to go through all the data I had collected in the database and verify that a stray delimiter or something did not mess up which field the data landed in.
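As a sketch of what that standardization step looks like (the day mappings, the pipe delimiter, and the field count below are illustrative assumptions, not my actual code):

```python
import re

# Illustrative mapping for normalizing day-of-week variants before storage;
# USC's real data is messier, so treat these entries as examples.
DAY_MAP = {
    "monday": "M", "mon": "M", "m": "M",
    "tuesday": "T", "tue": "T", "tues": "T",
    "wednesday": "W", "wed": "W", "w": "W",
    "thursday": "TH", "thu": "TH", "thurs": "TH", "th": "TH",
    "friday": "F", "fri": "F", "f": "F",
}

def normalize_days(raw):
    """Map 'Wednesday', 'Wed', or 'W' to one canonical code."""
    tokens = re.split(r"[,/\s]+", raw.strip().lower())
    return [DAY_MAP[t] for t in tokens if t in DAY_MAP]

def validate_row(row, expected_fields=4):
    """Catch stray delimiters that would shift fields before a DB write."""
    return len(row.split("|")) == expected_fields

print(normalize_days("Wednesday, Fri"))  # -> ['W', 'F']
```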

Research Perspectives and Challenges for Bitcoin and Cryptocurrencies

Summary: This paper is a survey of Bitcoin from 2014, when the platform was starting to gain a lot of attention because of its spike in value. I think the paper focuses too much on the technical aspects of why Bitcoin gained traction; I would have preferred a historical approach covering the attitudes that led to the creation of the currency. The Bitcoin genesis block was famously tagged with a jab at the global banking order (a newspaper headline about bank bailouts), and the early adopters of the currency were people who didn't really have a lot of trust in a government that could print endless amounts of cash. The paper doesn't focus much on this, and as a result skips over a lot of the reasons Bitcoin, in my opinion, became popular.

 

What I liked:

  1. The paper acknowledged the evolving nature of the bitcoin system

  2. The study was first in its field when it came out around 2014

  3. The paper gives a good technical history of the technology that preceded Blockchains

  4. The paper is a very broad survey of a lot of different topics in Bitcoin - a very good primer

  5. Easy to understand for a person who didn't really know much about the technology

 

What I didn't like:

  1. The paper is a little bit outdated, given it's been a few years since it came out

  2. The paper doesn't really acknowledge times when miners have worked together in order to preserve the system

  3. The paper doesn't really go into details about major problems the blockchain has faced, e.g., the time there were two competing chains, which created a major problem

  4. The paper doesn't seem to acknowledge that it doesn't make sense for one person to control the network - first because it's expensive to do, and second because if they did control the network, the network would become worthless

  5. The paper doesn't go into enough depth about where the tech is heading

 

Points for discussion:

  1. How open is Bitcoin to price manipulation?

  2. What role does/should ethics play for miners processing transactions on the network

  3. Why had researchers stayed away from the field prior to this paper?

  4. Despite the lack of security studies, how has the system remained secure?

  5. How can this forking system be replicated in other systems?

 

 

New Ideas:

  1. Some type of 3rd party that can rate how illegal a transaction might be so it isn't transacted on the network

  2. New ways to prevent criminals from using the network to launder money

  3. How can bitcoin be used as a free speech platform given its immutability

  4. Bitcoin arbitrage across different markets because of price discrepancies

  5. Finding the optimal time to move money, i.e., when transaction counts are low

 

Arbitrum: Scalable, private smart contracts


Summary: I think that certain blockchain ideas get a little bit too much hype, and this is definitely one of them. Arbitrum claims to be a step up on Ethereum by offering smart contracts that have the ability to scale. But I think there are some very fundamental problems with the way this system is implemented which leave it open to DoS attacks. Furthermore, I think the system as a whole places too much power in the hands of third parties, while systems like Ethereum depend only on the code. This is my least favorite paper of the year so far.
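For context on how disputes get resolved: Arbitrum settles challenges with a bisection game, narrowing a disagreement over a whole off-chain execution down to a single VM step that the chain can check cheaply. A toy sketch of that narrowing (plain integers stand in for VM state hashes, and both sides' state lists are assumed available):

```python
# Toy bisection dispute game: binary-search for the first execution step
# where a cheating party's claimed states diverge from the honest ones.

def honest_states(n):
    # honest execution trace: state i is just i (stands in for a state hash)
    return list(range(n + 1))

def cheating_states(n, lie_from):
    # a trace that matches the honest one until step `lie_from`, then lies
    return [i if i < lie_from else i + 100 for i in range(n + 1)]

def bisect_dispute(claimed, truth):
    """Narrow the dispute to one step; the chain then re-executes that step."""
    lo, hi = 0, len(claimed) - 1          # parties agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if claimed[mid] == truth[mid]:
            lo = mid                      # agree up to mid: dispute in 2nd half
        else:
            hi = mid                      # disagree at mid: dispute in 1st half
    return hi                             # the single disputed step

n = 16
print(bisect_dispute(cheating_states(n, 11), honest_states(n)))  # -> 11
```

The point of the logarithmic narrowing is that the chain only ever checks one step, no matter how long the computation was, which is where the scalability claim comes from.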

What I liked:

  1. The system takes an off-chain approach to figuring out how to conduct transactions -- very good because of the latency issues that plague normal blockchains - there's a general push in blockchains to move work off the chain

  2. The system, for what it's worth, is somewhat cheaper to run because not everyone is doing the same computations

  3. The paper claims to solve scalability issues for EVM smart contracts; it's debatable whether it actually does this, but the fact that they are focusing in on the issue is pretty admirable.

  4. I think there's a lot of customization in the VM options which is something you don't get when working with the standard EVM

  5. It emulates what I think would be a reasonable human system in the 20th century - it's something I could explain to my grandma decently well.

 

What I didn't like:

  1. Ethereum and EOS are probably the industry standards - deviating from what everyone else is doing doesn't really help improve security. I'd prefer if they built a system on top of those

  2. I think there's too much power given to the verifiers in this system - it's a little bit too centralized to scale well in practice

  3. Checking proofs every time there is a dispute could be a little bit challenging - for one, who determines what's correct? E.g., going back to the split between Ethereum and Ethereum Classic, where both sides had a defensible answer

  4. I think it is very likely you could overwhelm a manager and make them miss their timing window - the fact that this type of system needs human intervention makes me very skeptical of it

  5. I don't like the idea of negative incentives, i.e., penalizing someone for challenging something. At worst, I think you should end up no worse than you started.

 

Discussion Points:

*Just as a general note -- this paper left a lot of questions in my mind, which I guess will end up being answered through usage if this ever gets deployed in the real world

  1. Under what circumstances would a malicious actor be able to take control of the smart contract

  2. Is it possible to add new managers to the contract while the contract is already alive, e.g., for a company appointing a new board member?

  3. Does a manager need to be online 24/7 in order to make sure their VM is working correctly and as intended?

  4. How do you make a legal system that has no consensus method baked in? That just puzzles me a bit, because they claim it's platform agnostic, but I think there should be some form of default

  5. Are there any ambiguous cases where a verifier could split both ways?

 

New Ideas:

  1. Making a version where you take out the verifier and just have the VM decide what to do (this ends up being very similar to what we have in Ethereum) -- the reason being that people are ambiguous and code is not

  2. An attack model of just creating an endless stream of disputable assertions - I don't think this system would hold up against it

  3. Create a way to predict, before the program is run, how much the VM will spend trying to run it

  4. How much more efficient would this system be if the customization of VMs were stripped?

  5. Study adoption methods within the wider smart contract community

DeepXplore: Automated Whitebox Testing of Deep Learning Systems

Summary:

This paper focuses on a novel strategy for detecting safety and security concerns in places like self-driving vehicles, where there is simply too much data for humans to sort through and label manually. As a result, there are endless test cases that just have not been accounted for. The whitebox approach takes two similar models and compares their outputs to surface edge cases that neither model has accounted for. Where I think the paper falls short is that the approach doesn't work too well if you don't have a model similar enough to the one you want to find edge cases for.
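The comparison idea can be sketched as differential testing: feed the same inputs to two models and treat any disagreement as a candidate corner case worth a human look. The threshold "classifiers" below just stand in for real DNNs (DeepXplore additionally mutates inputs via gradient ascent to maximize neuron coverage, which this sketch omits):

```python
# Toy version of DeepXplore's differential-testing idea: two models trained
# for the same task should mostly agree; inputs where they diverge expose
# behavior at least one model gets wrong, without needing labeled data.

def model_a(x):
    return "stop" if x < 0.5 else "go"

def model_b(x):
    return "stop" if x < 0.6 else "go"  # slightly different decision boundary

def find_disagreements(inputs):
    """Inputs where the models diverge are candidate corner cases."""
    return [x for x in inputs if model_a(x) != model_b(x)]

inputs = [i / 10 for i in range(10)]
print(find_disagreements(inputs))  # -> [0.5]
```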

 

What I liked:

  1. This paper is really relevant to new technologies (e.g., the self-driving tech in the Teslas around us)

  2. The whitebox approach they use is very fast and doesn't require custom hardware - they were able to get results in one second on a commodity laptop

  3. The paper explains really well why adversarial image techniques might not be so effective in finding erroneous behaviors in deep learning models

  4. I thought it was a really good idea to compare responses from one DL model to another to figure out edge cases

  5. The end result of the experiment had great improvement in neuron coverage

 

What I didn't like:

  1. The basis of this paper depends on having a similar DL model for the same application - it runs into trouble when you have a novel application

  2. I don't think it's feasible to test every single corner case in any model; even though this approach does create some improvement, it's impossible to cover every scenario with a finite test set

  3. The computer doesn't know what trend it is missing, i.e., the adversarial examples prove the model is keying on features that might not be material to the context

  4. The entire study runs on the idea that these platforms and models use shared data sets, which isn't really true in the private sector

  5. The study never really explains why 3% is the maximum gain from their technique - I found it interesting that it was only 3%

Points for discussion:

  1. Is 3% the size of the incorrect set that the testbox was able to catch? Where does this 3% improvement come from?

  2. What causes the disparity between the code and neuron coverage?

  3. What improvements can be made in order to prevent this disparity?

  4. How practical would it be for companies to share common testing data?

  5. How will regulation in software testing for autonomous vehicles change this field?

 

New Ideas:

  1. A strategy for sharing data among competitors that doesn't expose what the specific edge case is. This could be used by regulators to prevent accidents. E.g., Google has a model that is run against a competitor's model, which fails; before the competitor deploys the product, they have to find and fix that edge case

  2. What if we used an old model as a tester to compare against a new model? Would this show the new edge cases that have been solved?

  3. How can this DL technology be used in finding malware among normal computer software?

  4. What ways can this technology be built into current standards for software testing?

  5. What are the limits of this corner-case optimization, given that in theory it could be endless?