M-A's technology blog

Wednesday, 13 February 2013

Exception handling and non-critical components


Most software developers consider exception handling to be different from error handling, but should these really be handled differently? Wikipedia has a full page on exception handling.

Exception handling in theory

For example, let's take memcached clients across various languages. Suppose an error is a key not being in memcache, and an exception is the client failing to connect to the server. Note that it's the library that decides what is an exception versus an error, and different libraries disagree with each other. It becomes even more confusing in languages like C++, where there is no common idiom for where the separation line should lie.

Exception handling in practice

Let's use the 4 partial snippets below to describe the difference in practice. While I'm referring to the AppEngine documentation, that's out of pure laziness; this post has nothing to do specifically with AppEngine and only a little to do with memcache client libraries themselves.

It's all about exception and error handling idioms in each language:

C (ref)

memcached_st* memc = memcached(...);
size_t length = 0;
uint32_t flags = 0;
memcached_return_t error;
char* value = memcached_get(memc, "item", 4, &length, &flags, &error);
if (value == NULL) {
  // Regen the value.
}

Java (ref)

import com.google.appengine.api.memcache.*;
MemcacheService syncCache = MemcacheServiceFactory.getMemcacheService();
byte[] value = syncCache.get("item");
if (value == null) {
  // Regen the value.
}

python (ref)

from google.appengine.api import memcache
value = memcache.get("item")
if not value:
  # Regen the value.

Go (ref)

import "appengine/memcache"
if item, err := memcache.Get(c, "item"); err != nil {
  // Regen the value.
}

Exception handling and components

Can you spot the errors above? To help you, read this post about the Chaos Monkey:
http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html
and figure out what would happen with each of the 4 clients above if a chaos monkey decided to take a memcached server down. In particular, spot the client library design error with respect to which components are actually necessary to reply to an HTTP request.

It's simple: memcache is not a necessary component for an HTTP request to succeed. In practice, any request to the web server should always succeed in any of the languages above, even if memcached is not available. But it won't in 2 out of 4, because they handle exceptions differently from errors. In particular, for an optional module, this turns an innocuous error into an HTTP 500! The fix is to rewrite these as the following:

Java

import com.google.appengine.api.memcache.*;
byte[] value = null;
try {
  MemcacheService syncCache = MemcacheServiceFactory.getMemcacheService();
  value = syncCache.get("item");
} catch ( <Figure out which exception to catch> ) {
  // Nothing to do.
}
if (value == null) {
  // Regen the value.
}

python

from google.appengine.api import memcache
value = None
try:
  value = memcache.get("item")
except <Figure out which exception to catch>:
  # Nothing to do.
  pass
if not value:
  # Regen the value.

At first glance, this turns the shortest snippets into horribly long sequences of code for non-critical code paths. But there are more issues in there.

Handle a runtime exception, or not

Let's focus on deciding which exception to catch. This has to be decided at each call site, and it's very easy to catch different exceptions at different call sites if one is not careful. To make things worse, the exhaustive list of runtime exceptions that can be raised is not documented! In practice, one has to see the exception happen to figure out which exception to handle. But it is not always practical to "kill the memcached server and see what happens".
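
One way to keep that decision in a single place is to wrap the optional lookup in a helper, so the choice of what to catch is made once instead of at every call site. Here's a minimal python sketch; get_optional() and _CACHE_EXCEPTIONS are hypothetical names, and you still have to fill in the exceptions your memcache library actually raises:

from google.appengine.api import memcache

# Hypothetical placeholder: replace with the exceptions your client library
# actually raises. IOError is only here so the sketch runs as-is.
_CACHE_EXCEPTIONS = (IOError,)

def get_optional(key):
  """Returns the cached value, or None if the cache is unavailable."""
  try:
    return memcache.get(key)
  except _CACHE_EXCEPTIONS:
    # To the caller, a cache miss and a cache outage are equivalent:
    # regen the value either way.
    return None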

Preemptive handling

A junior developer would quickly be tempted to put a catch-all handler like "} catch (Exception e) {" or "except Exception:". That is worse, because then you'd catch exceptions like DeadlineExceededException or DeadlineExceededError, which could turn an HTTP request that could have possibly completed into one that has no chance of completing. https://developers.google.com/appengine/articles/deadlineexceedederrors
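
If a broad catch really cannot be avoided, the minimum damage control is to re-raise the exceptions that must propagate. A sketch, assuming the AppEngine python runtime; get_optional() is still a hypothetical helper:

from google.appengine.api import memcache
from google.appengine.runtime import DeadlineExceededError

def get_optional(key):
  try:
    return memcache.get(key)
  except DeadlineExceededError:
    # The request is out of time; hiding this would guarantee failure.
    raise
  except Exception:
    # Broad catch: tolerable only because memcache is optional here.
    return None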

Exception documentation

Java tries to solve this problem with checked exception signatures, but the core problem with exception signatures in Java is that they force the developer to handle the exceptions he probably doesn't care about, the ones explicitly listed, and simultaneously fail to document or help the developer handle the runtime exceptions, which are the ones he cares about! It's even worse in Oracle's Unchecked Exceptions — The Controversy, where the author uses the following fallacy:
Runtime exceptions represent problems that are the result of a programming problem, and as such, the API client code cannot reasonably be expected to recover from them or to handle them in any way.
Inside the memcache client library, a component, the fact that a network connection fails may be considered an exception. But this thinking assumes all the components are necessary to achieve work item completion, which, as demonstrated above, is not true, especially in distributed systems. Another example is dynamic content with a static content fallback, which Netflix also implements. So the hypothesis of linear control flow behind unchecked exceptions doesn't hold; in modern systems, the control flow can adapt to "exceptional" runtime situations and, more often than not, doesn't care whether a value wasn't in the key store or the whole db is temporarily unavailable.
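
To make the point concrete, here's a toy python sketch of that fallback pattern; every name in it is made up:

class BackendUnavailableError(Exception):
  """Hypothetical error raised when the dynamic backend is down."""

def render_dynamic(request):
  raise BackendUnavailableError()  # Simulate the chaos monkey.

def serve_static(request):
  return 'static fallback for %s' % request

def serve(request):
  try:
    return render_dynamic(request)
  except BackendUnavailableError:
    # The "exceptional" situation is part of normal control flow:
    # degrade to the static copy instead of failing the request.
    return serve_static(request)

print(serve('/home'))  # static fallback for /home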

Throw or return null?

So next time someone asks to be able to use exceptions in C++, just say no. If you are stuck with Java or python for web development, assume unplanned downtime. If you are starting a new project, strongly consider a language that doesn't use exceptions, like Golang or C, or in the case of C++, libraries that don't use exceptions. Personally, I'm totally sold on Golang because exception-safe code is shorter.

Thursday, 15 November 2012

Unicode equivalence may not be handled as you think

Unicode normalization does not always happen the way you would expect, especially w.r.t. file systems. First, I recommend reading about it on the Wikipedia page http://en.wikipedia.org/wiki/Unicode_equivalence, which is fairly well written:
In general, the code points of truly identical characters (which can be rendered in the same way in Unicode fonts) are defined to be canonically equivalent.
Unicode has 2 equivalence notions, with pre-composed and decomposed code point sequences representing the same characters, and 2 normal forms: the canonical one (NF) and the "compatible" one (NFK).
In order to compare or search Unicode strings, software can use either composed or decomposed forms; this choice does not matter as long as it is the same for all strings involved in a search, comparison, etc. On the other hand, the choice of equivalence criteria can affect search results. For instance some typographic ligatures like U+FB03 (ffi), roman numerals like U+2168 (Ⅸ) and even subscripts and superscripts, e.g. U+2075 (⁵) have their own Unicode code points. Canonical normalization (NF) does not affect any of these, but compatibility normalization (NFK) will decompose the ffi ligature into the constituent letters, so a search for U+0066 (f) as substring would succeed in an NFKC normalization of U+FB03 but not in NFC normalization of U+FB03. Likewise when searching for the Latin letter I (U+0049) in the precomposed Roman Numeral Ⅸ (U+2168). Similarly the superscript "⁵" (U+2075) is transformed to "5" (U+0035) by compatibility mapping.
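
As a quick illustration of the quote above, python's unicodedata module exposes the four normal forms directly; the values below match the quote's examples:

import unicodedata

# U+FB03 is the "ffi" ligature: canonical normalization keeps it,
# compatibility normalization decomposes it into the three letters.
print(repr(unicodedata.normalize('NFC', u'\ufb03')))   # u'\ufb03'
print(repr(unicodedata.normalize('NFKC', u'\ufb03')))  # u'ffi'

# U+2168 is the Roman numeral nine: NFKC maps it to the letters 'IX'.
print(repr(unicodedata.normalize('NFKC', u'\u2168')))  # u'IX'

# U+2075 is the superscript five: NFKC maps it to the plain digit '5'.
print(repr(unicodedata.normalize('NFKC', u'\u2075')))  # u'5'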
I found this out while writing a universal file tracer for the Chromium project. [Spoiler alert]: the gory details are buried in trace_inputs.py. Note that the code in trace_inputs.py also does case normalization, which is a subject in itself, maybe for another post.

I wasn't sure about each OS's behaviour with regard to file path handling, so I wrote a small python script to figure out exactly what happens. I pasted the script's code at the bottom of this post. I'll let you guess what happens on each of the following OSes: OSX 10.8, Ubuntu 12.04 with LANG=foo.UTF-8, and Windows 7. The analysis focuses on whether NF or NFK normalization is applied when trying to open a file. I'm explicitly excluding case normalization (a vs A) for this post.

Ubuntu

Let's start with Ubuntu, which behaved exactly as I imagined. Note that I'm using LANG=foo.UTF-8:
~/src/foo> ./unicode_is_hard.py
e-acute-circumflex
Found 2 different encodings for u'\u1ebf'
  NFKC: u'\u1ebf'
   NFD: u'e\u0302\u0301'
   NFC: u'\u1ebf'
  NFKD: u'e\u0302\u0301'
  OS returned: u'NFC\u1ebf', u'NFDe\u0302\u0301', u'NFKC\u1ebf', u'NFKDe\u0302\u0301'

roman_numeral_one
Found 2 different encodings for u'\u2160'
  NFKC: u'I'
   NFD: u'\u2160'
   NFC: u'\u2160'
  NFKD: u'I'
  OS returned: u'NFC\u2160', u'NFD\u2160', u'NFKCI', u'NFKDI'

e-acute-circumflex + roman_numeral_one
Found 4 different encodings for u'\u1ebf\u2160'
  NFKC: u'\u1ebfI'
   NFD: u'e\u0302\u0301\u2160'
   NFC: u'\u1ebf\u2160'
  NFKD: u'e\u0302\u0301I'
  OS returned: u'NFC\u1ebf\u2160', u'NFDe\u0302\u0301\u2160', u'NFKC\u1ebfI', u'NFKDe\u0302\u0301I'
How Nautilus displays the files
As you can see, the file system is not processing the Unicode code points at all, so what you write is what you get. Now I'll let you guess what happens on OSX and Windows. Place your bets.

Windows

Windows is interesting because it didn't behave as I expected.
D:\src>python unicode_is_hard.py
e-acute-circumflex
Found 2 different encodings for u'\u1ebf'
  NFKC: u'\u1ebf'
   NFD: u'e\u0302\u0301'
   NFC: u'\u1ebf'
  NFKD: u'e\u0302\u0301'
  OS returned: u'NFC\u1ebf', u'NFDe\u0302\u0301', u'NFKC\u1ebf', u'NFKDe\u0302\u0301'

roman_numeral_one
Found 2 different encodings for u'\u2160'
  NFKC: u'I'
   NFD: u'\u2160'
   NFC: u'\u2160'
  NFKD: u'I'
  OS returned: u'NFC\u2160', u'NFD\u2160', u'NFKCI', u'NFKDI'

e-acute-circumflex + roman_numeral_one
Found 4 different encodings for u'\u1ebf\u2160'
  NFKC: u'\u1ebfI'
   NFD: u'e\u0302\u0301\u2160'
   NFC: u'\u1ebf\u2160'
  NFKD: u'e\u0302\u0301I'
  OS returned: u'NFC\u1ebf\u2160', u'NFDe\u0302\u0301\u2160', u'NFKC\u1ebfI', u'NFKDe\u0302\u0301I'

How Windows Explorer displays the files
As you can see, and this was unexpected to me, Windows doesn't normalize the Unicode code points to NFK, so you get whatever the program wrote, just like on Ubuntu. As a spoiler, Cygwin does the same but I left out its output for brevity. Note how the rendering is significantly different for \u2160 (I), unlike Ubuntu's default rendering in Unity.

OSX

If you have already played with Unicode code point normalization and had to touch OSX, you probably know why I kept it for last:
~/src/foo> ./unicode_is_hard.py
e-acute-circumflex
Found 2 different encodings for u'\u1ebf'
  NFKC: u'\u1ebf'
   NFD: u'e\u0302\u0301'
   NFC: u'\u1ebf'
  NFKD: u'e\u0302\u0301'
  OS returned: u'NFCe\u0302\u0301', u'NFDe\u0302\u0301', u'NFKCe\u0302\u0301', u'NFKDe\u0302\u0301'
  2 are not matching.
  For  NFC, expected  NFC, NFKC but could with  NFC,  NFD, NFKC, NFKD
  For NFKC, expected  NFC, NFKC but could with  NFC,  NFD, NFKC, NFKD
  For  NFD, expected  NFD, NFKD but could with  NFC,  NFD, NFKC, NFKD
  For NFKD, expected  NFD, NFKD but could with  NFC,  NFD, NFKC, NFKD

roman_numeral_one
Found 2 different encodings for u'\u2160'
  NFKC: u'I'
   NFD: u'\u2160'
   NFC: u'\u2160'
  NFKD: u'I'
  OS returned: u'NFC\u2160', u'NFD\u2160', u'NFKCI', u'NFKDI'

e-acute-circumflex + roman_numeral_one
Found 4 different encodings for u'\u1ebf\u2160'
  NFKC: u'\u1ebfI'
   NFD: u'e\u0302\u0301\u2160'
   NFC: u'\u1ebf\u2160'
  NFKD: u'e\u0302\u0301I'
  OS returned: u'NFCe\u0302\u0301\u2160', u'NFDe\u0302\u0301\u2160', u'NFKCe\u0302\u0301I', u'NFKDe\u0302\u0301I'
  2 are not matching.
  For  NFC, expected  NFC but could with  NFC,  NFD
  For NFKC, expected NFKC but could with NFKC, NFKD
  For  NFD, expected  NFD but could with  NFC,  NFD
  For NFKD, expected NFKD but could with NFKC, NFKD
How Finder displays the files
As you can see, OSX is the only OS that normalizes Unicode code points. But it does only partial normalization: NFD vs NFC, but not NFKx vs NFx. That's interesting, as I'd have expected NFK handling instead. So a file written in NFKx cannot be opened as NFx, but NFC vs NFD is transparently converted.
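
In practice, this means comparing file paths byte-for-byte across OSes is unreliable. A common mitigation, sketched below under the assumption that you only care about canonical equivalence, is to normalize both sides before comparing; same_path() is a hypothetical helper, not necessarily what trace_inputs.py does:

import unicodedata

def same_path(a, b):
  # Canonical equivalence only: NFC vs NFD compare equal, but NFKx
  # variants stay distinct, matching OSX's behaviour above.
  return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

print(same_path(u'\u1ebf', u'e\u0302\u0301'))  # True
print(same_path(u'\u2160', u'I'))              # False: compatibility only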

The code

#!/usr/bin/env python
# Copyright (c) 2012 Marc-Antoine Ruel. All rights reserved.

"""This scripts create a subdirectory named unicode_is_hard which contains
various files in various encoding.

See http://en.wikipedia.org/wiki/Unicode_equivalence for the various UTF
encodings.
"""

import os
import shutil
import sys
import unicodedata

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

def try_with_string(work_dir, unicode_string):
  """Normalizes a unicode string into the 4 normal forms, creates a file for
  each and tries to open each file using the other forms.
  """
  # Delete the work directory if present.
  if os.path.isdir(work_dir):
    shutil.rmtree(work_dir)
  os.mkdir(work_dir)

  encodings = (u'NFC', u'NFKC', u'NFD', u'NFKD')
  encoded = dict(
      (key, unicodedata.normalize(key, unicode_string)) for key in encodings)
  filenames = dict((key, key + value) for key, value in encoded.iteritems())

  # This implicitly assumes python does the right thing here.
  different_encodings = len(set(encoded.itervalues()))
  print(
      'Found %d different encodings for %r' %
      (different_encodings, unicode_string))
  for encoding, value in encoded.iteritems():
    print('  %4s: %r' % (encoding, value))

  # Now for each type, create a file. See if the other encodings can open it.
  for filename in filenames.itervalues():
    open(os.path.join(work_dir, filename), 'w').close()

  files_found = sorted(os.listdir(work_dir))
  print('  OS returned: %s' % ', '.join(repr(i) for i in files_found))
  not_matching = set(filenames.itervalues()).difference(files_found)
  if not_matching:
    print('  %d are not matching.' % len(not_matching))

  expected = {}
  for encoding, value in encoded.iteritems():
    # Assumes comparison in python is correctly done.
    for encoding_to_confirm, value_to_confirm in encoded.iteritems():
      if value_to_confirm == value:
        expected.setdefault(encoding, []).append(encoding_to_confirm)

  # Now do the 16 combinations to try to open each file with the other
  # encodings.
  actual = {}
  for encoding, original_filename in filenames.iteritems():
    for encoding_to_try, value_to_try in encoded.iteritems():
      # Try to open the file with the other encoding.
      try:
        open(os.path.join(work_dir, encoding + value_to_try)).close()
        actual.setdefault(encoding, []).append(encoding_to_try)
      except IOError:
        pass

  # Print if anything unexpected succeeded. This happens in the case
  # encoded[encoding1] != encoded[encoding2] but they could open each other.
  for encoding in encodings:
    if sorted(expected[encoding]) != sorted(actual[encoding]):
      print(
          '  For %4s, expected %s but could with %s' % (
            encoding,
            ', '.join('%4s' % i for i in sorted(expected[encoding])),
            ', '.join('%4s' % i for i in sorted(actual[encoding]))))

def main():
  work_dir = os.path.join(unicode(BASE_DIR), u'unicode_is_hard')

  # Examples taken from the Wikipedia page and unicodedata python stdlib doc.
  # http://docs.python.org/2/library/unicodedata.html
  e_acute_circumflex = u'\u1ebf'
  roman_numeral_one = u'\u2160'

  print('e-acute-circumflex')
  try_with_string(work_dir, e_acute_circumflex)

  print('\nroman_numeral_one')
  try_with_string(work_dir, roman_numeral_one)

  print('\ne-acute-circumflex + roman_numeral_one')
  try_with_string(work_dir, e_acute_circumflex + roman_numeral_one)
  return 0

if __name__ == '__main__':
  sys.exit(main())

Friday, 12 October 2012

Short 10-item work-efficiency recipe

Here's a repost of a message I wrote internally at Google. I had been asked how to be more efficient, or put another way, how to generate that much code. To get an idea, you can look at the data here:
http://svnsearch.org/svnsearch/repos/CHROMIUM/search?view=plot&author=maruel%40chromium.org. In that time frame, I also contributed to buildbot, Rietveld, and worked on Google-internal projects.

So here's my short 10-item work-efficiency recipe:

1. Always keep the same work schedule as much as possible

But work when your brain is in flow state. If you wake up at noon and get to bed at 3am every day, keep it always the same. When you're 25 years old, it's fine to be less stable in your work schedule. You'll get old eventually, if you survive yourself, and your body will end up hating what you do to it. Work on weekends if needed, but keeping a stable schedule is important for maximal brain efficiency. Keep coding up to the exact moment you see yourself unsure of the design of the next line to write; stop coding at that point.

2. Do small changes

Other committers probably have a larger diffstat than me, but their CLs are more complicated and thus harder to read. I try to make small CLs because a small CL is:
  • Easier to glance at to figure out what it's doing.
  • Much easier to review, reducing turnaround time -> enables review over email -> improves your own efficiency.
  • Easier to revert, with less chance of merge errors.
  • When doing small changes, it's possible to TBR= the patches more often. TBR in this context means "to be reviewed".

3. Learn to cope with review latency

When doing small changes, you can pipeline them to reduce the effect of review latency. You can cheat sometimes with TBR but don't abuse it too much. Working on 2 projects concurrently helps a lot. I often start with a large change, then split it up into smaller CLs. This always improves the quality of the code.

4. Take time to pay technical debt

It's probably worth setting aside 20% of your coding time for technical debt:
  • Adding tests. In spring 2011, I spent 3 months writing unit tests for depot_tools. It was really depressing but it really helped afterward.
  • Refactoring poor designs. It's fine to accumulate technical debt, since it's often only after the fact that you can really see the best design. Do not try to design too much up front, unless you're designing an API!
Often people are afraid to refactor because of the cost of doing so. Planning is the key. Split it into subtasks:
  1. Identify consumers.
  2. Identify problem and how a new interface would fix the problem.
  3. Evaluate the cost/benefit of the refactor. Think about intangibles: would it reduce the learning curve for a potential new contributor?
  4. Create the new interface.
  5. Write tests for the new interface.
  6. Alert everyone.
  7. Switch consumer to new interface.
  8. Wait for propagation delay.
  9. Remove old interface.
This applies mostly everywhere. It requires being methodical. But sometimes, give up; the refactor is not worth it! A refactor for the fun of refactoring is skipping the "Identify the problem" step. See the next item.

5. Focus on your user's benefit

Do not focus on code or just yet-another-feature. It's not the number of commits or the diffstat; it's stuff that works that counts. Do not fix problems in old code; do not be afraid to deprecate cleanly. Work on complex problems! Fix a complex problem with many simple solutions by splitting the problem into parts so that as many existing components as possible can be reused. Work most of the time on non-visible grungy stuff, but occasionally work on highly visible projects, otherwise you'll get no recognition.

6. Fix repetition with code

Kill idle time with code. Take the time to automate anything you see yourself doing 3 times. Write one-liner scripts and put them in an SCM. Separate your public scripts from your private ones; for example, this permits putting the public ones on github.

7. Write the code to be refactorable in the first place

This is generally overlooked by new grads but it is extremely important. Someone will be stuck with the code you wrote 4 years from now, will hate you, and will wonder why you did it this way. So at the very least, make it easy for them to refactor it.

That's why I always align function call arguments at +4 on the following line, so that adding a single argument is a single-line diff that is very easy to revert or merge with other commits. Never align at "(", otherwise the moment you rename the function, you have to realign all the call sites!
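
To make it concrete, here's what the two styles look like in python; frob() is a made-up function:

def frob(target, flags, verbose):
  pass

# +4 alignment: renaming frob() or adding an argument is a one-line diff.
frob(
    'chrome.exe',
    ['--enable-logging'],
    True)

# "(" alignment: renaming frob() forces re-indenting every continuation
# line at every call site.
frob('chrome.exe',
     ['--enable-logging'],
     True)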

Another example is to use style checks or static analysis.

8. Abuse to some extent the "test on prod" mentality

To be able to achieve that, you have to:
  • Write code defensively. Especially with python, sprinkle asserts generously; see the sketch after this list.
  • Plan for failure. If everything breaks, what is the cascading effect? Plan for cascading failure. For example, a gclient sync [the Chromium meta-checkout tool] breakage could DDoS the subversion server, so provision accordingly.
  • Make breakages not too important -> do many incremental changes instead of big ones.
  • Make sure it's easy to revert fast (small CLs).
  • Have some sort of monitoring. Devs yelling at you is a form of monitoring; otherwise, it's time to pay technical debt. I abuse 'devs monitoring' a bit. Try to do it without pissing off your colleagues too much.
  • Unit tests are great. You need tests. But you need integration (smoke) tests too. Are your mocks representative of the actual implementation? Is the component you rely on working with your use case?
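
As a tiny illustration of the defensive style from the first bullet above, here's a hypothetical python helper; the names and the checks are made up:

import os

def apply_patch(checkout_dir, patch_lines):
  """Hypothetical helper: the asserts document and enforce assumptions."""
  # Fail fast and loudly on bad input instead of corrupting a checkout.
  assert os.path.isabs(checkout_dir), checkout_dir
  assert patch_lines, 'refusing to apply an empty patch'
  for line in patch_lines:
    # Catch malformed input early; debugging later is far more expensive.
    assert not line.endswith('\n'), line

apply_patch('/src/chromium', ['--- a/build.py', '+++ b/build.py'])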

9. Optimize your work environment

  • Make your text editor efficient. I personally use vim exclusively, even if I do not consider myself a power user. Spend an inordinate amount of time configuring it. Try a few before settling in.
  • Use the CLI all the time.
  • Try to never touch the mouse. But still use a high-quality mouse.
  • Use a high-quality keyboard. Grab a keyboard where the F-keys are near the numbers row if you use F-keys. Millimeters count.
  • Take time to learn how to use your SCM and review tool. As an example in Chromium-land, commands like "git cl comments" help boost your productivity.
  • Not using a GUI makes it easier to effectively use any wait time I may have: grab the laptop, fire up an ssh window and screen -x exactly where I had left off. Set up ssh keys to reduce wait time. That helps with #1.
  • Do not be lazy. Use the best tools available. There are awesome engineers in the world producing new tools that could be of use to you; use their output. The list of tools is different from last year's; be prepared for change. For example, "If you are not using ninja to build Chromium, you are compiling it wrong(tm)". Do not accept the status quo for toolsets.

10. Optimize your meta-work environment

  • Do not get distracted. Social events are great. Visit other offices if you work in a multi-office environment. Meeting colleagues face to face is extremely important to build trust relationships. Otherwise, join meet-ups to learn how other companies solve common problems. But most of your time should be spent coding if you are a SWE.
  • Communicate asynchronously as much as possible. But when it's time for coordination, communicate synchronously: VC/IM/F2F.
  • Do not be shy. You are not paid to be shy. That doesn't mean being a jerk, just don't be afraid to ask questions. Be prepared to receive RTFM as an answer.
  • Do not meta-work. In Gmail, filter out as much as possible. Force yourself to use keyboard shortcuts in Gmail. Do not spend as much time on G+ as I do. :) Meetings are meta-work. Meta-work is your #1 enemy.
  • Reduce communication overhead as much as possible. Use broadcasts instead of 1:1s to spread information. Use mailing lists instead of direct email addresses for easier searchability and archival.
  • If you do not like working with someone, do not work with that person. Do not let management overhead kill your productivity.

Friday, 15 July 2011

Want to rent a movie tonight? Can you calculate how much it'll cost you?

Or how many movies can you rent on iTunes in a month?

For demonstration purposes, let's say you love "Funkytown", you are silly, and you want to rent it multiple times within a month. Its HD version is 4.6gb at a price of 6.99$ on the iPad. For consistency, I'm taking each ISP's cheapest package above 5mbps and assuming you have another service with the ISP to get the reduced cost.

ISP        Province  Monthly price  Bandwidth  Allowance  Extra
Vidéotron  Québec    43,95$         8mbps      50gb       4.50$/gb, max 50$
Rogers     Ontario   46,99$         10mbps     60gb       2.00$/gb, max 50$
Bell       Québec    42,95$         7mbps      60gb       2.50$/gb, unlimited cost?
Bell       Ontario   43,90$         6mbps      25gb       2.50$/gb, unlimited cost?

With most ISPs, you would be able to rent up to 13 movies in a month, provided you never watch Youtube videos at 135mb/hour, go to tou.tv, or do Skype or Hangouts on Google+ at 720mb/hour. And don't ever think about installing your latest operating system service pack, for each of your computers and laptops, which sometimes weighs nearly a gigabyte.

If you have a family, think teenagers watching Bieber in a loop and blowing past the monthly cap, you'll end up renting the 11th movie on Vidéotron at an effective cost of 6.99$ + 4.6gb * 4.50$/gb = 27.69$. Yes, it's ridiculous.

I do welcome the bounding of extra bandwidth costs. I think it strikes a fair balance between limiting heavy usage and extortion. But the extra bandwidth cost is usually unbounded for business accounts, like mine. This puts small businesses in an even weaker position, as they can't afford not to have internet access and usually have multiple concurrent users on a single connection. In fact, small businesses are the ones losing the most in this situation.

Now think about it: most independent ISPs have caps around 200gb, which would permit you to rent 43 movies in a month, which makes more sense as an upper limit.
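
These numbers are easy to reproduce; here's a small python sketch using the figures above:

MOVIE_GB = 4.6         # HD rental size.
MOVIE_PRICE = 6.99     # iTunes rental price.
OVERAGE_PER_GB = 4.50  # Vidéotron's extra bandwidth rate.

# Movies that fit in a 60gb cap, assuming no other traffic at all.
print(int(60 // MOVIE_GB))   # 13

# Effective cost of one extra movie once the 50gb cap is blown.
print(round(MOVIE_PRICE + MOVIE_GB * OVERAGE_PER_GB, 2))  # 27.69

# With an independent ISP's 200gb cap instead:
print(int(200 // MOVIE_GB))  # 43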

Network bandwidth is not like water or electricity; an idle router and a congested router both have the exact same cost. As a counterpoint, a congested router will have lower throughput than a non-congested one, so there is a need to balance usage. That's fair; we don't want too many congested routers causing slow connections. My point is that having unbounded extra costs, especially above 1.00$/gb, is nearing extortion, in particular for small businesses.

References
Vidéotron
http://www.videotron.com/service/internet-services/internet-access/high-speed-internet

Rogers
(note how the details are hidden in a FAQ on an almost unbranded site)
http://www.rogers.com/web/link/hispeedBrowseFlowDefaultPlans
http://www.keepingpace.ca/faq.html#9

Bell
http://www.bell.ca/shopping/en_CA_QC.Performance/DSLTIPQCNewMassNCQPF06.details
http://www.bell.ca/shopping/en_CA_ON.Performance/DSLTIPONNewMassNCOPF10.details

Disclaimer
I work for Google but I did this research on my own time. It doesn't represent the opinion of my employer. I pay for my extra bandwidth.

Thursday, 14 April 2011

Putty configuration

Saving my preferences here since I always forget:

  • Session
    • Close window on exit: Always
  • Terminal
    • Bell
      • Taskbar/caption indication on bell: Steady
    • Features
      • Disable remote-controlled window title changing: True
  • Window
    • Lines of scrollback: 2000
    • Behaviour
      • Window title: <session name>
      • Separate window and icon titles: True
      • Warn before closing window: False
    • Translation
      • UTF-8
    • Colours
      • ANSI Blue: 0, 0, 242
      • ANSI Blue Bold: 132, 132, 255
  • Connection
    • SSH
      • Remote commands: "screen -x"
      • Preferred SSH protocol version: 2 only
      • Encryption cipher selection policy: Move up "--warn below here--" to leave only "AES (SSH-2 only)" enabled.
      • Tunnels
        • <Set relevant tunnels>
Set as startup program: "...\pageant.exe ...\<private key>.ppk"

Friday, 11 March 2011

Generating passwords

Note to myself, as I always forget. How to generate a (mostly) uncrackable password:
sudo apt-get install apg
apg -m 9 -MLNS -a0 -t
This requests: minimum 9 characters, which must contain a lowercase letter, a numeral and a symbol, be pronounceable, and print the pronunciation.

Then,

  • Prepend /! for irc&bash safety.
  • Append any accented letters from this (non-exhaustive) list: çÇ àÀáÁäÄâ éÉèÈëËêÊ íÍìÌïÏîÎ óÓòÒöÖôÔ úÚùÙüÜûÛ ýÝÿ ±£¢¤¬¦²³¼½¾¶§µ¯. All these letters can be seamlessly typed on a FR-CA keyboard with AltGr or a two-key combination.
    • You can reduce the apg complexity requirements because of this step, since it adds a lot of entropy and each of these letters is 2 bytes of UTF-8, dramatically increasing the effective password length.
    • If you are selecting your password on linux, don't forget that Windows won't accept certain combinations like ȩȨ ÝŸŷŶ. You may want to avoid them if you ever plan to log in from a Windows workstation.
    • «»° aren't accessible on all FR-CA keyboards, so you need to memorize the Alt-Numlock combination.
    • Similar alternatives for Spanish speakers: ¿¡
  • You now have a password that:
    • is mostly copy-paste safe
    • is uncrackable by most rainbow tables. Who generates a UTF-8 rainbow table with ½ or µ at a length of 12 characters?
    • will probably not be accepted by most web sites since it's too secure. :(

Wednesday, 19 January 2011

1 kibioctet = 1023,937522 octets

Are you searching because of the poster put up in your university? I'm giving a talk about silent data corruption on January 27 at École Polytechnique de Montréal and on January 28 at Université Laval in Québec City.

But first, find the answer to the riddle.

Good luck!

Note: some will perhaps search for 1 kibioctet = 1023.937522 octets even though the poster uses a comma.

Update #1

If you are having trouble finding the answer, I suggest applying:

Update #2

The riddle has nothing to do with 1024 in particular.


Update #3

Mr. Munroe made a typo when writing this number.


Update #4

Related to the previous hint: if you give a URL as an answer, it will contain "/394".


Update #5

It was all caused by a manufacturer.


Update #6
(Updated 2011-03-29)

Visit maruel.github.com to see the presentation.