Tresor-not

Over at the Personal Clouds list, the topic of Tresorit came up.  The service provides “completely secure cloud storage” and offers USD$10k to anyone who can hack the client-side encryption.  The word Tresor is either an encryption method designed to avoid use of RAM and implemented in the Linux kernel, the German word for vault or safe, or it’s a women’s fragrance.  After reviewing the web site, I’m not sure to which of these meanings the site name actually refers.  I suspect one of the first two was intended but the last is closer to what is described and implemented on the site.

The first issue that came up on the mailing list was that if something happens to your PC with the keys, the data is lost because the folks at Tresorit don’t hold copies of the keys and cannot recover them for you.  I actually like that Tresorit can’t recover your keys.

If I want a TNO (Trust No-One) encryption system, then it is necessary to perform all the encryption locally so that the hosting provider never has the keys.  All they see is chunks of encrypted data.  By definition, the hosting provider cannot recover your keys in this situation.  However, they can make it easier for you to recover them.  For example, most decent TNO systems generate a long symmetric key, then encrypt that with a hash of your pass phrase.  This allows you to periodically change your pass phrase without re-encrypting the entire data store (which would require rendering it in plaintext at least briefly).

Another benefit of this approach is that it’s possible to encrypt that same symmetric key using multiple pass phrases.  So you could have the one you use every day and a backup written down and stored in a safe deposit box.  Another implementation of this approach is to take that symmetric key and perform a recursive, nested encryption on it to produce a list of one-time passwords.  Each iteration of using a password reveals the key as well as the next nested object containing the key encrypted with the next password.
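To make the key-wrapping idea concrete, here is a minimal sketch using only the Python standard library.  Everything in it is an assumption for illustration: the names wrap_key and unwrap_key are made up, PBKDF2-HMAC-SHA256 stands in for the “hash of your pass phrase”, and a simple XOR mask plus an HMAC tag stands in for a real key-wrap cipher.  It says nothing about what Tresorit or any other product actually does.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, not a recommendation


def _derive(passphrase: str, salt: bytes) -> bytes:
    """Stretch the pass phrase into 64 bytes: 32 to mask the key, 32 for the MAC."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=64
    )


def wrap_key(data_key: bytes, passphrase: str) -> dict:
    """Protect a 32-byte data key under a pass phrase (toy XOR mask, not AES key wrap)."""
    if len(data_key) != 32:
        raise ValueError("data key must be 32 bytes")
    salt = secrets.token_bytes(16)  # fresh salt per slot, so a mask is never reused
    material = _derive(passphrase, salt)
    masked = bytes(a ^ b for a, b in zip(data_key, material[:32]))
    tag = hmac.new(material[32:], salt + masked, hashlib.sha256).digest()
    return {"salt": salt, "masked": masked, "tag": tag}


def unwrap_key(slot: dict, passphrase: str) -> bytes:
    """Recover the data key, or raise ValueError if the pass phrase is wrong."""
    material = _derive(passphrase, slot["salt"])
    expected = hmac.new(
        material[32:], slot["salt"] + slot["masked"], hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, slot["tag"]):
        raise ValueError("wrong pass phrase or damaged slot")
    return bytes(a ^ b for a, b in zip(slot["masked"], material[:32]))


# The long random key that actually encrypts the data store.
data_key = secrets.token_bytes(32)

# The same key wrapped under an everyday pass phrase and a backup pass phrase.
slots = {
    "daily": wrap_key(data_key, "correct horse"),
    "backup": wrap_key(data_key, "deposit box phrase"),
}

# Changing the everyday pass phrase rewrites one small slot, not the data store.
slots["daily"] = wrap_key(unwrap_key(slots["daily"], "correct horse"), "new daily phrase")
```

The data store itself is only ever encrypted with data_key, so the pass phrases can come and go without touching it.  The hosting provider in this sketch would see only the slots and the encrypted chunks.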

These approaches give the user options for storing backup copies of the passwords physically or via an escrow service.  However, there are 2 implications.  First, the hosting service cannot ever help you recover any keys or passwords for your data store.  The second is that you are the one and only natural root of trust in the system.  Either there’s one password and you risk loss of all your data; or there are copies of it and your risk of exposure to others is increased; or you delegate trust by sharing your password deliberately.  In the end with these systems, you get to pick the type of risk you are more comfortable with: loss of a single unique key; or redundant keys and increased risk of exposing them.

Unfortunately, it isn’t at all clear what approach Tresorit takes.  The web site only assures you that they don’t hold your keys.  It might be more obvious after installing the client whether you can generate multiple pass phrases, one-time passwords or other redundant key management, but after reviewing the site there was no way I was going to invest time in installing the software.  More on that to follow.  For the moment, let’s continue on with the password management issues.

There’s a distinction to be made between the password with which your keys are protected versus the password to your hosting account.  Even though your data is encrypted, it is important to keep the hosting account secure.  You don’t want random strangers to be able to overlay, delete, or spoof your data by hacking your hosting account, even if that data is encrypted and not useful to them.  Since the login password is completely unrelated to your encryption key and its password, the hosting service can usually provide some sort of password reset function for the login to their site.

But not at Tresorit.  Their FAQ states that

Without your password, you will not be able to login to your Tresorit account anymore, except if Autologin was enabled. Your login password is not stored anywhere neither by the software on your machine nor by the team of Tresorit. Not even AutoLogin stores your password in plaintext, only a one-way derivative of it. Therefore, there is no way for password recovery. AutoLogin, in this sense, is a last straw to grasp at in order to synchronize you tresors with your local hard disk. After doing so, we advise you to register again. This, however, will result in losing all your data in the cloud. No worries, though, your locally stored files will remain untouched.

And that brings up what appears to be a flaw in their design.  Explaining the problem first requires an understanding of the normal process so let’s briefly go over basic password management.

When you register on a web site and set up your password, the site should NEVER store that password in such a way that the plaintext version of it can be revealed.  Obviously, that means the password cannot be stored in plaintext but what is less obvious is that it should not be stored using reversible encryption.  Why is that?

Imagine you are signing onto my web site to send me a large donation.  (OK, well that’s what I imagine.  You can just imagine any password-based sign-on.)  You send your password to the web server.  The web server now must compare the password you sent to the one in its database.

Password you sent: monkey
Password in database: 2vbv2CA*C%U|Wu.7eB{1Ur/-3

Obviously, monkey != 2vbv2CA*C%U|Wu.7eB{1Ur/-3

In order to verify your password, the web server must either translate the password you send or else reverse the encryption of the password in the database so that the two match:

monkey == monkey

or

2vbv2CA*C%U|Wu.7eB{1Ur/-3 == 2vbv2CA*C%U|Wu.7eB{1Ur/-3

If the web server is able to reverse the encryption, then a compromised web server or someone who steals the database can also reverse the encryption.  What you NEVER want to see is any situation where the password you send is compared directly to the stored value or something derived from the stored value.  So the standard approach is to use one-way encryption on the password when storing it in the database, then at login the web server uses the same algorithm on the password you provide and compares the results to the stored value.
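As a rough sketch of that standard approach, here is what server-side storage and verification might look like in Python.  The function names, the per-user salt and the choice of PBKDF2-HMAC-SHA256 with 100,000 iterations are my assumptions for illustration and go a little beyond the simplified description above.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for this sketch


def make_record(password: str) -> dict:
    """Run at registration: store only a salt and a one-way digest, never the password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"salt": salt, "digest": digest}


def verify(record: dict, submitted: str) -> bool:
    """Run at login: apply the same one-way function to what the user sent, then compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", submitted.encode("utf-8"), record["salt"], ITERATIONS
    )
    # Constant-time comparison avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, record["digest"])


record = make_record("monkey")
```

Here verify(record, "monkey") succeeds and verify(record, "donkey") fails.  The value sent by the user is never compared directly to anything in the record, which is the property that matters for the rest of this discussion.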

That’s a bit of an oversimplification but if you are still with me, let’s plow ahead.

Recall that the Tresorit FAQ states:

Your login password is not stored anywhere neither by the software on your machine nor by the team of Tresorit. Not even AutoLogin stores your password in plaintext, only a one-way derivative of it. Therefore, there is no way for password recovery.

Did you spot the disconnect here?  If you type a password in, the web server is expected to transform it and compare it to the stored value.  But if the Tresorit client stores a one-way hash of the password as the auto-login credential, then it cannot possibly send the plaintext password to the server during auto-login.  So how is it the server can authenticate you with either your plaintext password or a hashed derivation of the same password?

All the possibilities that come to mind are a bit scary.  Here’s one scenario:

  • When you register you type in “monkey”.
  • The web server applies an algorithm resulting in a stored value of 2vbv2CA*C%U|Wu.7eB{1Ur/-3 for the password.
  • The client applies the same algorithm when you sign in and converts monkey to the value 2vbv2CA*C%U|Wu.7eB{1Ur/-3 locally.
  • The client then sends the computed value and, if you checked the auto-login option, stores that value locally.
  • The web server compares the value sent by the client to the one in the database and sees that 2vbv2CA*C%U|Wu.7eB{1Ur/-3 == 2vbv2CA*C%U|Wu.7eB{1Ur/-3

Whoops!  The purpose of applying the 1-way hash to the value supplied at login isn’t to conceal the human-readable string you used as your password.  It is to ensure that the value stored in the database cannot reveal the string used to log in if the database is compromised.  If the human-readable value is hashed at the client and the result directly compared to the value in the database, then the computed value is the password.  Even though it’s not possible to calculate the human-readable string from which it is derived, that isn’t the point.  Anyone who breaches the database has the actual password values and can use those to log on as you.
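The scenario above can be sketched in a few lines of Python.  This is purely hypothetical: ClientHashedServer and client_hash are invented names, plain SHA-256 is just a stand-in for whatever transform a client might apply, and nothing here is known to match Tresorit’s actual design.

```python
import hashlib
import hmac


def client_hash(password: str) -> str:
    """Assumed client-side transform; any deterministic one-way function shows the same flaw."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


class ClientHashedServer:
    """Hypothetical server that stores and compares whatever value the client computed."""

    def __init__(self) -> None:
        self.db = {}

    def register(self, user: str, client_value: str) -> None:
        self.db[user] = client_value

    def login(self, user: str, client_value: str) -> bool:
        stored = self.db.get(user)
        return stored is not None and hmac.compare_digest(stored, client_value)


server = ClientHashedServer()
server.register("alice", client_hash("monkey"))

# An attacker who dumps the database never learns "monkey"...
stolen = dict(server.db)

# ...but does not need it: replaying the stored value is a valid login.
attacker_logged_in = server.login("alice", stolen["alice"])
```

In this sketch attacker_logged_in comes out True even though the attacker never saw the human-readable password.  Hashing again on the server side before comparing would break the replay, because the stolen value would no longer be what the server expects to receive.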

In addition, the fact that an attacker with access to the database in this scenario can’t derive the human-readable strings doesn’t mean the passwords are safe.  We know from studying data exposed in past breaches that certain passwords like “monkey” are extremely popular.  Assuming that the passwords are hashed on the client side and compared directly to the database value, an attacker can simply apply the hash algorithm to a dictionary containing the most popular passwords and then compare the results to the database.

If you put in monkey and the value 2vbv2CA*C%U|Wu.7eB{1Ur/-3 pops out the other side, then any users with a stored password of 2vbv2CA*C%U|Wu.7eB{1Ur/-3 must have used monkey as their plaintext password.  This isn’t reversing or breaking the encryption so the attacker won’t be able to get the passwords for every user entry in the database.  However, about 40% of them will be easily found from a dictionary of the most common passwords.  Of those, about 30% will have reused their password across multiple sites, so an attacker may have just unlocked the online banking, e-store, email or other accounts of roughly 12% (40% × 30%) of the registered users of the site.
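Here is a small sketch of that dictionary attack in Python.  The leaked table, the user names and the four-word dictionary are all made up, and unsalted SHA-256 is an assumption chosen to keep the example short.

```python
import hashlib


def unsalted_hash(password: str) -> str:
    """Assumed deterministic, unsalted transform applied to every user's password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: the attacker sees only the hashed values.
leaked = {
    "alice": unsalted_hash("monkey"),
    "bob": unsalted_hash("letmein"),
    "carol": unsalted_hash("x9$Lq!27vTz#"),
}

# A tiny stand-in for a dictionary of popular passwords.
common_passwords = ["123456", "password", "monkey", "letmein"]

# Hash the dictionary once, then look up every leaked value in it.
lookup = {unsalted_hash(p): p for p in common_passwords}
cracked = {user: lookup[h] for user, h in leaked.items() if h in lookup}
```

In this sketch alice and bob fall to the dictionary while carol, with an uncommon password, does not.  Nothing was reversed.  The attacker simply ran the same function forward over likely inputs.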

Please bear in mind that this is all conjecture.  Tresorit does not provide any of the technical details of how they manage and store login credentials and they may have invented something completely novel and effective.  Unfortunately the site lacks any technical explanation so there’s no way to tell whether this is brilliance or bullshit.

But here’s a clue.

The entire tresorit.com site is hosted on http rather than https pages.  Tresorit uses Zendesk as their Support portal.  If you re-type the URL and insert the ‘s’ into https, you can successfully initiate a secure connection.  But what you get is the Zendesk certificate which the browser flags as invalid since it’s showing up at the tresorit.com domain.  If you then add an exception to allow the mismatched domain, the browser is immediately redirected to an http page so it isn’t possible to force the browser onto secure pages and have it work.  Finally, SSL Labs reports that the site is susceptible to the BEAST attack so that TLS security they brag about?  It can be bypassed from the client side by a well-known attack.

Arguably, your login here is to a support site which contains little personally identifiable information beyond your name and email and is therefore trivial.  However, as we know, people often reuse their passwords and so any site that accepts a login should render the login form over https, submit the login request over https and then maintain the login session exclusively over https.  This is basic web security hygiene.  The argument that it isn’t mandatory when information behind the login is public anyway neglects the password reuse issue.  Passwords are never trivial.

But let’s stipulate for the sake of argument that passwords protecting only name and email address actually are trivial.  Perhaps for example, the login credentials protecting your mailing list settings are trivial.  But this isn’t email list software.  It’s a security product.  Anything in the security category should take the position that login credentials are never trivial.  That’s part of what it takes to be a credible security product.

It gets wackier.  The Tresorit Features page brags that

Decrypting tresors without authorisation is mathematically unfeasible – it would take several lifetimes to crack one.

Files are encrypted with AES-256 before being uploaded to the cloud. Additional security is provided before upload by HMAC message authentication codes applied on SHA-512 hashes. Encrypted files are uploaded to the cloud using TLS-protected channels.

So, if I understand this correctly, Tresorit first encrypts the data in such a way that it would take “several lifetimes” to crack.  They then transmit this over TLS channels because when it comes to encryption of your data, more is apparently better.  Adding TLS to protect AES-256 encrypted data addresses the use case that we vastly expand human lifespans to hundreds of years and someone dedicates all that extra time to decrypting your files.  Glad we got that covered.

But back in the present day, your login credentials are trivial, transmitted in plaintext and not deserving of TLS.

On the web site for what is presented as an advanced security product.

Developed by people who should be aware of password reuse statistics.

And whose user base will contain at least a few people who reused their login password as their encryption password.

This is a huge credibility issue for me and the reason I declined to install and try the client.  I wouldn’t touch it with your ten-foot pole, let alone get near it personally until they fix that.  Even then, the fact that they did it at all makes me doubt their ability to assess the integrity and reliability of their design and in particular I’m skeptical of the explanation of auto-login and the inability to reset (not recover) passwords.

Good security stands up to scrutiny.  If this stuff is good, then open it to inspection, have it pen-tested aggressively, and post deep technical details on the site for those who require some proof.  Finally, take this whole “security” thing seriously, run the site over https and patch for the BEAST and other known attacks.

About T.Rob

Computer security nerd. WebSphere MQ expert. Autist. Advocate. Author. Humanist. Text-based life form. Find me on Twitter or LinkedIn.

21 Responses to Tresor-not

  1. Maszek says:

    Interesting that everyone here seems concerned about encryption issues, not about geographic and legislation issues. Does it not matter where Tresorit servers are located? What laws would apply in revealing data there on to national security services?
    BTW: Allow me to point out that this blog is not hosted on https page.

    • T.Rob says:

      > Does it not matter where Tresorit servers are located? What laws would apply in revealing data there on to national security services?
      Geography matters greatly if your data is in plaintext. If the data are encrypted and only you hold the keys, it is a whole lot more important which jurisdiction *you* are in. If the servers are in the US but the service does the encryption and key management correctly, there is no benefit (with today’s technology and for the foreseeable future) for a three-letter agency in taking the server-side data. However, a little torture (since we do that now in the US) and people will give up their keys, making stealing the server-side data an exercise in futility.

      > BTW: Allow me to point out that this blog is not hosted on https page.
      I’m not sure I understand your point. Mine was that a company selling a security product based on crypto and marketed as a solution to protect user data should do basic account management and site security correctly. To do otherwise damages their credibility. On the other hand, this site doesn’t sell anything at all and doesn’t require login to comment. Are you saying that these two sites should have equal security? Or that TLS is the baseline requirement, even for sites that don’t do account management or use a 3rd party for account management?

  2. T.Rob says:

    Thanks for the in-depth cryptanalysis, guys! Good to see that Tresorit is working on the issues *and* willing to post detailed white papers on the site. If nobody minds, can I bring the conversation back to a less technical level? Is it possible Tresorit will turn on TLS for their web server? Currently the Tresorit site will run TLS but only with a cert from Azure and so fails the SSL Labs tests with an F. The Tresorit page on Zendesk at least gets a B grade but has several problems there too. There are certain categories of web site that should just run TLS all the time. Currently those include Twitter, Facebook and Google. Surely if something so trivial as micro-blogging is capable of and sees the importance of full-time TLS on all pages, then one might expect a security cloud provider to do the same, right?

    It would also be reassuring if Tresorit’s Privacy and Cookie policy link did not go to a 3rd party web site’s privacy and cookie policy. I want to know what *Tresorit* commits to do with my information, not what Zendesk does with it.

  3. orcmid says:

    @István,

    One more question. For the profile to be encrypted with a different derived key (using salt spro instead of the salt s), I assume that part of initial setup is for something like E(H(p,spro,10 000), profile)|spro to be communicated to the Tresorit server. Since the first logon from a new owner install does not know spro, this needs to come back with the encrypted profile to be used on a new installation of an owner client.

    The derived key can be other than H(p,spro,10 000), but it can only be based on what the authenticating new client knows plus what Tresorit can tell it that is not enough for Tresorit to decrypt the profile itself. It also seems to me that spro can’t be retained at the client outside of a session.

    The change-password ceremony is interesting with regard to logging onto an owner client with the new password when that client is not where the password-change was initiated.

    I suppose this is not the best place to raise these questions, but the context is here.

    • István Lám says:

      We try to do our best to reply to all questions, on our website, or elsewhere. 🙂

      The encrypted profile looks like this:
      master_key = H(p,spro,10 000)
      encrypted_profile = spro | spro_hmac | E(master_key, profile) | HMAC( E(…), H(master_key,spro_hmac,2 ) )

      When you post the third message in the protocol ( R = H(nc| ns | u, H(p,s,10 000), 1) | nc ) as an HTTP GET request, the response of the server will be the encrypted_profile. All over a TLS connection, of course.

      Note, that spro is independent from the salt used to derive the ‘x’ key sent to the server at registration. Which means, server does not have any information about your master_key or password.

      • orcmid says:

        Good, that confirms what I had asked about. Thanks for the encrypted_profile information.

        I want to come back to T.Rob’s analysis.

        Since the server stores u|x|s (necessary for the challenge-response protocol), if even u|x is ever disclosed (by compromise of the server), an attacker can learn s without knowing the password and can spoof the response to the challenge. (At this point the attacker can also attack the password p).

        In spoofing the response to the challenge, the attacker obtains the encrypted_profile in return. (This may be helpful if it is not known which of a set of u[I] the x is for.)

        At this point, the attacker cannot go further without knowing p, but there is significant material for now attacking p off-line and that cracks encrypted_profile.

        This is also a potential insider attack.

        [It’s amusing that finding a collision p’ on x is not necessarily a collision on master_key. I don’t see any value to that in this scenario.]

      • Szilveszter Szebeni says:

        Dear Orcmid,

        If you get access to “u” you can get access to “s” no problem (just try to authenticate with “u”). To access “x” you need to hack the server….

        If you manage to get u|x|s, where x=H(p,s,10 000), you can try the same offline attack as if you got access to encrypted_profile. So in essence in this case you have the same security. You do not get any advantage by spoofing the challenge response. The challenge response is intended to validate that you know “x” anyway….

        With a strong password such an attack would not be feasible.

        If your attacker model assumes that the server is also an attacker then the two situations are the same.

        In the case of an outside attacker, he has to overcome larger obstacles, like getting access to “x”.

  4. orcmid says:

    @István,

    Thank you for the links. That information is much clearer than what I found on the site previously.
    I set up my account via the installed application (which is presumably why I had no problem with the 50GB offer). I did not use my browser; I used the account creation function of the Tresorit client. Consequently, I did not see the problem with http versus https that T.Rob noticed.

    On the initial setup, I assume that the PKE key pair is generated in some manner with appropriate entropy and completely independent of u|p. The PKE pair is encrypted in a manner that can be decrypted by a different installation of the client that successfully authenticates by the procedure you show. A new client installation by the “owner” can then recover and decrypt its PKE pair from the server. The additional client on that account can now operate as the “owner,” with the same public key.

    In this setting, the owner’s “p” is the critical secret and the owner’s public key is “semi-secret,” since it is only used quietly in these protocols.

    Is that a good summary?

    • István Lám says:

      Owner’s public key can be public, but as a user, you cannot query other users’ public key currently.

      Yes, the password is the critical secret in Tresorit. We are working on providing additional security in later releases to minimize the risk of stolen passwords and stolen private keys. We will also publish it when we release it.

  5. István Lám says:

    Dear All,

    Great to see that you take care of what we designed, and understand it!

    I designed the password handling protocol in Tresorit. We just posted an official statement about this. I hope, you will find it useful.

    You can find the article: http://support.tresorit.com/entries/23577091-How-my-password-is-managed-in-Tresorit-

    • orcmid says:

      I did find this useful and have more questions to test my understanding. I apologize – I forgot to use the direct comment reply function of the blog and responded in new comments.

  6. orcmid says:

    @Jon The Tresorit account user identification (a verified e-mail address) and the password are enough to completely set up a machine that syncs (and decrypts) all of the data that the account has stored in the service. This also does not depend on any of the other setups for that account being on-line at the time. That means the e-mail address and user-chosen password are completely sufficient to produce all of the “owner” secrets that presumably never leave the user’s machine and yet are somehow shared among those different “owner” instances in some manner. As the man said, “It’s a mystery.”

    Oh, and Tresorit 0.5 provides no indication of how many instances of clients for the same account exist, so there is no indication when an imposter has started watching all of your data, including data shared with you by anyone, completely decrypted in that faux-owner’s local storage.

    For password-based security, the various mitigations such as salting and iteration increase the work factor but technically cannot improve on the entropy of the password itself (if the salt and the iteration, etc., are discoverable too). The work factor increase does not change the probability of a crack the exponential way more original indistinguishable-from-random bits do. I have seen password recovery work against PBKDF2 password based keys simply because a dictionary of known passwords and common password patterns was used. As T.Rob notes, if the work is done on the client, what goes on the wire is technically the password and what one would then attack. It might be harder to attack, but if it can be discovered there’s a shortcut entry to everything protected by it.

    The current Tresorit 0.5 beta makes it difficult to register an account with a really tough password. At least on my computer, I could only manually enter the dual entry of an initial-account password. I was unable to copy and paste from my password-generating safe, so I had to generate one that I was capable of typing. It appeared that I needed to use only A-Za-z0-9 so I needed a long one for that.

    Since the claim for Tresorit is for a TNO security level, there may be encrypted files that are quite valuable. This can justify the work a determined adversary and its cloud-sourced partners are willing to undertake. Especially for relatively long-lived data and the prospect of being able to impersonate an attractive target. The issue is not attacking other accounts, the Tresorit account may be the pot of gold.

    There is presumably more information about the client as part of the $10,000 bounty for a crack that will be posted any day now. It will be interesting.

  7. Dennis says:

    Very interesting! Is there a secure hosting service you use? Have you looked at SpiderOak – I’m interested in how good you think it is. I’ve been using it for awhile… don’t love the user experience, but not terrible.

    • T.Rob says:

      I’ve been doing the low-tech thing. I encrypt locally using TrueCrypt vaults and/or file-specific AES-256 encryption, then use cheap, insecure off-site storage. Steve Gibson has evaluated several secure, TNO-style cloud storage hosts though. Check out the Security Now archive. SpiderOak was reviewed in Episode #349. He’s covered many others as well. His assessment of most of these is about as discouraging as mine is of Tresorit. Sadly, many vendors think security is all about the crypto and that the crypto is solved. As if you just plug the components together, slap on a UI and voila! No. To do it right, it is necessary to understand the threats, the mitigations, and all the underlying abstractions, dependencies and assumptions built into both the primitive components and the assembled architecture. And if the target market includes anyone other than casual users, for example security professionals, military, finance, etc., then it’s also necessary to expose the design to external, independent review by qualified pen testers and security pros. Part of the problem is that casual users don’t realize they need to make that review part of their purchase decision. Only when that becomes a market differentiator will companies begin to compete on their design having withstood that level of scrutiny.

  8. Jon Levell says:

    I’m a little confused by your section about login passwords. As I understand it, websites are expected to store a one-way hash of your password concatenated with a website specific “salt” value. During login the client can calculate the same hash(password+salt) and transmit that. Indeed it does make the “hash” the password for /that/ website. If the attacker has access to the backend of that website – that website is doomed.

    The point is that if someone steals the database, it is *ONLY* the password to /that/ website. Because it’s a one-way hash, it’s hard to reverse and get the plain-text password (which may be used elsewhere). The salt value should be different between websites, and means the hashes can’t be compared against “rainbow tables” of hashes of commonly used passwords unless the rainbow table is generated for that specific salt. This means a database of username + salted hashes is less use in attacking *other* websites.

    I could be misunderstanding though….

    • T.Rob says:

      Hi Jon. It is true that the hashing and salting mitigates the threat of the breached password table being used on other sites. However, that’s only one of the threats that needs to be mitigated. Ideally, the system should also mitigate the threat of an exposed password table being used to take over accounts of its own users as well. Hashing on the client side does not mitigate that threat but hashing on the server side does. In the event of a SQL injection flaw in the web site, an attacker might dump the password table but it won’t allow them to take over the account.

      Another issue with client-side hashing is that it is inherently less secure. If I do the hashing with a server-class machine then I can invest in a computationally expensive algorithm and thousands of iterations of hashing. The user sees an extra second added to their login time but to an attacker it adds trillions upon trillions of seconds to the task of computing a rainbow table. But if we compute the hash on the client side the choice of an algorithm and iteration count is affected by the capabilities of the target device. If we want it to work on underperforming netbooks or mobile devices, we’ll need to dumb down the hashing and reduce the iterations.

      Also, if the number of hash iterations is known to the client, then we’ve lost another important security factor. If everyone used exactly 4,096 hash iterations it’s still possible to compute rainbow tables, it just takes longer. But if you use 4095 and I use 4097 iterations, even if we use the same salt, that’s two completely different hash tables. If the client computes the hash, then the exact number of iterations is known to an attacker. But by computing the hash on the server side, it becomes possible to hide the exact number of iterations from the attacker, thus adding an extra hurdle to breaching the site.

      All of this comes down to cost/benefit. Are there cases in which the risk is so low that client-side hashing is acceptable? Possibly. But not if what you are selling is cloud-based encryption intended for the storage of a person’s vital records and digital identity, or that you plan to bet the business on.

      • BilllR5 says:

        @T.Rob “But if you use 4095 and I use 4097 iterations, even if we use the same salt, that’s two completely different hash tables.” But both are only one precomputed rainbow table entry (assuming say a 5000 iteration rainbow table), as I understand it.

        Rainbow tables are a way to compress multiple iterations so you _do_not_ have a complete hash table for each iteration. The rainbow table only stores the start value and end value (hash of start value after say 5000 iterations). This is often described as the first and last links of a chain since the iterations are “chained” or nested together, e.g.,
        hash(hash(hash(hash(“monkey”+salt18))))

        Note that a rainbow table may contain all possible starting values but usually just contains some starting values (e.g., previously used passwords and likely patterns; or all combinations of lowercase; or all upper and lower letters, numbers, and top row symbols).

        The crack starts with a “stolen” stored (hopefully salted) hash value. Worst case: if I don’t know how many iterations were performed originally, then I’ll have to hash the stolen hash value up to 5000 times, checking each one against all the entries in the rainbow table. Best case: if I also know the number of iterations, then I just hash the number of times to reach the end of the chain (5000 – 4095 = 995; or 5000 – 4097 = 993) and compare that value against all the end of chain entries in the rainbow table.

        • T.Rob says:

          I don’t disagree with anything you’ve said. All I was trying to point out is that there is some benefit in the attacker not knowing the number of iterations. If the client knows the individual and global salt values, the derivation function and the number of iterations and presents the pre-computed value, the result is almost the same as not having encrypted the password in the first place. Nobody can tell the password was ‘monkey’ by reversing the encryption, but the literal string presented by the client is the same as what’s in the database. The security in the system comes specifically from the configuration details to derive the encrypted value being known only to the server side. The more of these details are shared with the client, the poorer the effective security becomes, despite using the most advanced algorithms and massive hashing iterations.

    • BilllR5 says:

      “As I understand it, websites are expected to store a one-way hash of your password concatenated with a website specific “salt” value.” An individual salt for each account at that website is preferable to a sitewide “secret” salt (though a site could certainly use both).

  9. samjgarforth says:

    Great post. I wonder what the thinking is behind naming a women’s fragrance after a vault
