1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-27 01:02:42 +01:00

Merge branch 'flavortext'

This commit is contained in:
epriestley 2011-05-15 08:13:47 -07:00
commit 01bb30f905
10 changed files with 976 additions and 2 deletions

View file

@ -6,6 +6,7 @@
"config" : "Configuration",
"contrib" : "Contributing",
"userguide" : "Application User Guides",
"flavortext" : "Flavor Text",
"developer" : "Phabricator Developer Guides",
"differential" : "Differential (Code Review)",
"diffusion" : "Diffusion (Repository Browser)",

View file

@ -30,8 +30,9 @@ introduce new contributors to the codebase.
You should read the relevant coding convention documents before you submit a
change and make sure you're following the project guidelines:
- @{article:General Coding Conventions} (for all languages)
- @{article:PHP Coding Conventions} (for PHP)
- @{article:General Coding Standards} (for all languages)
- @{article:PHP Coding Standards} (for PHP)
- @{article:Javascript Coding Standards} (for Javascript)
= Philosophy =

View file

@ -0,0 +1,9 @@
@title About Flavor Text
@group flavortext
Explains what's going on here.
= Overview =
Flavor Text is a collection of short articles which pertain to software
development in general, not necessarily to Phabricator specifically.

View file

@ -0,0 +1,153 @@
@title Javascript Object and Array
@group flavortext
This document describes the behaviors of Object and Array in Javascript, and
a specific approach to their use which produces basically reasonable language
behavior.
= Primitives =
Javascript has two native datatype primitives, Object and Array. Both are
classes, so you can use ##new## to instantiate new objects and arrays:
COUNTEREXAMPLE
var a = new Array(); // Not preferred.
var o = new Object();
However, **you should prefer the shorthand notation** because it's more concise:
lang=js
var a = []; // Preferred.
var o = {};
(A possible exception to this rule is if you want to use the allocation behavior
of the Array constructor, but you almost certainly don't.)
The language relationship between Object and Array is somewhat tricky. Object
and Array are both classes, but "object" is also a primitive type. Object is
//also// the base class of all classes.
lang=js
typeof Object; // "function"
typeof Array; // "function"
typeof {}; // "object"
typeof []; // "object"
var a = [], o = {};
o instanceof Object; // true
o instanceof Array; // false
a instanceof Object; // true
a instanceof Array; // true
= Objects are Maps, Arrays are Lists =
PHP has a single ##array## datatype which behaves like as both map and a list,
and a common mistake is to treat Javascript arrays (or objects) in the same way.
**Don't do this.** It sort of works until it doesn't. Instead, learn how
Javascript's native datatypes work and use them properly.
In Javascript, you should think of Objects as maps ("dictionaries") and Arrays
as lists ("vectors").
You store keys-value pairs in a map, and store ordered values in a list. So,
store key-value pairs in Objects.
var o = { // Good, an object is a map.
name: 'Hubert',
species: 'zebra'
};
console.log(o.name);
...and store ordered values in Arrays.
var a = [1, 2, 3]; // Good, an array is a list.
a.push(4);
Don't store key-value pairs in Arrays and don't expect Objects to be ordered.
COUNTEREXAMPLE
var a = [];
a['name'] = 'Hubert'; // No! Don't do this!
This technically works because Arrays are Objects and you think everything is
fine and dandy, but it won't do what you want and will burn you.
= Iterating over Maps and Lists =
Iterate over a map like this:
lang=js
for (var k in object) {
f(object[k]);
}
NOTE: There's some hasOwnProperty nonsense being omitted here, see below.
Iterate over a list like this:
lang=js
for (var ii = 0; ii < list.length; ii++) {
f(list[ii]);
}
NOTE: There's some sparse array nonsense being omitted here, see below.
If you try to use ##for (var k in ...)## syntax to iterate over an Array, you'll
pick up a whole pile of keys you didn't intend to and it won't work. If you try
to use ##for (var ii = 0; ...)## syntax to iterate over an Object, it won't work
at all.
If you consistently treat Arrays as lists and Objects as maps and use the
corresponding iterators, everything will pretty much always work in a reasonable
way.
= hasOwnProperty() =
An issue with this model is that if you write stuff to Object.prototype, it will
show up every time you use enumeration ##for##:
COUNTEREXAMPLE
var o = {};
Object.prototype.duck = "quack";
for (var k in o) {
console.log(o[k]); // Logs "quack"
}
There are two ways to avoid this:
- test that ##k## exists on ##o## by calling ##o.hasOwnProperty(k)## in every
single loop everywhere in your program and only use libraries which also do
this and never forget to do it ever; or
- don't write to Object.prototype.
Of these, the first option is terrible garbage. Go with the second option.
= Sparse Arrays =
Another wrench in this mess is that Arrays aren't precisely like lists, because
they do have indexes and may be sparse:
var a = [];
a[2] = 1;
console.log(a); // [undefined, undefined, 1]
The correct way to deal with this is:
for (var ii = 0; ii < list.length; ii++) {
if (list[ii] == undefined) {
continue;
}
f(list[ii]);
}
Avoid sparse arrays if possible.
= Ordered Maps =
If you need an ordered map, you need to have a map for key-value associations
and a list for key order. Don't try to build an ordered map using one Object or
one Array. This generally applies for other complicated datatypes, as well; you
need to build them out of more than one primitive.

View file

@ -0,0 +1,88 @@
@title Javascript Pitfalls
@group flavortext
This document discusses pitfalls and flaws in the Javascript language, and how
to avoid, work around, or at least understand them.
= Implicit Semicolons =
Javascript tries to insert semicolons if you forgot them. This is a pretty
horrible idea. Notably, it can mask syntax errors by transforming subexpressions
on their own lines into statements with no effect:
lang=js
string = "Here is a fairly long string that does not fit on one "
"line. Note that I forgot the string concatenation operators "
"so this will compile and execute with the wrong behavior. ";
Here's what ECMA262 says about this:
When, as the program is parsed ..., a token ... is encountered that is not
allowed by any production of the grammar, then a semicolon is automatically
inserted before the offending token if one or more of the following conditions
is true: ...
To protect yourself against this "feature", don't use it. Always explicitly
insert semicolons after each statement. You should also prefer to break lines in
places where insertion of a semicolon would not make the unparseable parseable,
usually after operators.
= ##with## is Bad News =
##with## is a pretty bad feature, for this reason among others:
with (object) {
property = 3; // Might be on object, might be on window: who knows.
}
Avoid ##with##.
= ##arguments## is not an Array =
You can convert ##arguments## to an array using JX.$A() or similar. Note that
you can pass ##arguments## to Function.prototype.apply() without converting it.
= Object, Array, and iteration are needlessly hard =
There is essentially only one reasonable, consistent way to use these primitives
but it is not obvious. Navigate these troubled waters with
@{article:Javascript Object and Array}.
= typeof null == "object" =
This statement is true in Javascript:
typeof null == 'object'
This is pretty much a bug in the language that can never be fixed now.
= Number, String, and Boolean objects =
Like Java, Javascript has primitive versions of number, string, and boolean,
and object versions. In Java, there's some argument for this distinction. In
Javascript, it's pretty much completely worthless and the behavior of these
objects is wrong. String and Boolean in particular are essentially unusable:
lang=js
"pancake" == "pancake"; // true
new String("pancake") == new String("pancake"); // false
var b = new Boolean(false);
b; // Shows 'false' in console.
!b; // ALSO shows 'false' in console.
!b == b; // So this is true!
!!b == !b // Negate both sides and it's false! FUCK!
if (b) {
// Better fucking believe this will get executed.
}
There is no advantage to using the object forms (the primitive forms behave like
objects and can have methods and properties, and inherit from Array.prototype,
Number.prototype, etc.) and their logical behavior is at best absurd and at
worst strictly wrong.
**Never use** ##new Number()##, ##new String()## or ##new Boolean()## unless
your Javascript is God Tier and you are absolutely sure you know what you are
doing.

View file

@ -0,0 +1,301 @@
@title PHP Pitfalls
@group flavortext
This document discusses difficult traps and pitfalls in PHP, and how to avoid,
work around, or at least understand them.
= array_merge() in Incredibly Slow =
If you merge arrays like this:
COUNTEREXAMPLE
$result = array();
foreach ($list_of_lists as $one_list) {
$result = array_merge($result, $one_list);
}
...your program now has a huge runtime runtime because it generates a large
number of intermediate arrays and copies every element it has previously seen
each time you iterate.
In a libphutil environment, you can use ##array_mergev($list_of_lists)##
instead.
= var_export() Hates Baby Animals =
If you try to var_export() an object that contains recursive references, your
program will terminate. You have no chance to intercept or react to this or
otherwise stop it from happening. Avoid var_export() unless you are certain
you have only simple data. You can use print_r() or var_dump() to display
complex variables safely.
= isset(), empty() and Truthiness =
A value is "truthy" if it evaluates to true in an ##if## clause:
$value = something();
if ($value) {
// Value is truthy.
}
If a value is not truthy, it is "falsey". These values are falsey in PHP:
null // null
0 // integer
0.0 // float
"0" // string
"" // empty string
false // boolean
array() // empty array
Disregarding some bizarre edge cases, all other values are truthy. Note that
because "0" is falsey, this sort of thing (intended to prevent users from making
empty comments) is wrong in PHP:
COUNTEREXAMPLE
if ($comment_text) {
make_comment($comment_text);
}
This is wrong because it prevents users from making the comment "0". //THIS
COMMENT IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD BETTER NOT
BREAK IT!!!// A better test is probably strlen().
In addition to truth tests with ##if##, PHP has two special truthiness operators
which look like functions but aren't: empty() and isset(). These operators help
deal with undeclared variables.
In PHP, there are two major cases where you get undeclared variables -- either
you directly use a variable without declaring it:
COUNTEREXAMPLE
function f() {
if ($not_declared) {
// ...
}
}
...or you index into an array with an index which may not exist:
COUNTEREXAMPLE
function f(array $mystery) {
if ($mystery['stuff']) {
// ...
}
}
When you do either of these, PHP issues a warning. Avoid these warnings by using
empty() and isset() to do tests that are safe to apply to undeclared variables.
empty() evaluates truthiness exactly opposite of if(). isset() returns true for
everything except null. This is the truth table:
VALUE if() empty() isset()
null false true false
0 false true true
0.0 false true true
"0" false true true
"" false true true
false false true true
array() false true true
EVERYTHING ELSE true false true
The value of these operators is that they accept undeclared variables and do not
issue a warning. Specifically, if you try to do this you get a warning:
COUNTEREXAMPLE
if ($not_previously_declared) { // PHP Notice: Undefined variable!
// ...
}
But these are fine:
if (empty($not_previously_declared)) { // No notice, returns true.
// ...
}
if (isset($not_previously_declared)) { // No notice, returns false.
// ...
}
So, isset() really means is_declared_and_is_set_to_something_other_than_null().
empty() really means is_falsey_or_is_not_declared(). Thus:
- If a variable is known to exist, test falsiness with if (!$v), not empty().
In particular, test for empty arrays with if (!$array). There is no reason
to ever use empty() on a declared variable.
- When you use isset() on an array key, like isset($array['key']), it will
evaluate to "false" if the key exists but has the value null! Test for index
existence with array_key_exists().
Put another way, use isset() if you want to type "if ($value !== null)" but are
testing something that may not be declared. Use empty() if you want to type
"if (!$value)" but you are testing something that may not be declared.
= usort(), uksort(), and uasort() are Slow =
This family of functions is often extremely slow for large datasets. You should
avoid them if at all possible. Instead, build an array which contains surrogate
keys that are naturally sortable with a function that uses native comparison
(e.g., sort(), asort(), ksort(), or natcasesort()). Sort this array instead, and
use it to reorder the original array.
In a libphutil environment, you can often do this easily with isort() or
msort().
= array_intersect() and array_diff() are Also Slow =
These functions are much slower for even moderately large inputs than
array_intersect_key() and array_diff_key(), because they can not make the
assumption that their inputs are unique scalars as the ##key## varieties can.
Strongly prefer the ##key## varieties.
= array_uintersect() and array_udiff() are Definitely Slow Too =
These functions have the problems of both the ##usort()## family and the
##array_diff()## family. Avoid them.
= foreach() Does Not Create Scope =
Variables survive outside of the scope of foreach(). More problematically,
references survive outside of the scope of foreach(). This code mutates
##$array## because the reference leaks from the first loop to the second:
COUNTEREXAMPLE
$array = range(1, 3);
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as &$value) {}
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as $value) {}
echo implode(',', $array); // Outputs '1,2,2'
The easiest way to avoid this is to avoid using foreach-by-reference. If you do
use it, unset the reference after the loop:
foreach ($array as &$value) {
// ...
}
unset($value);
= unserialize() is Incredibly Slow on Large Datasets =
The performance of unserialize() is nonlinear in the number of zvals you
unserialize, roughly O(N^2).
zvals approximate time
10000 5ms
100000 85ms
1000000 8,000ms
10000000 72 billion years
= call_user_func() Breaks References =
If you use call_use_func() to invoke a function which takes parameters by
reference, the variables you pass in will have their references broken and will
emerge unmodified. That is, if you have a function that takes references:
function add_one(&$v) {
$v++;
}
...and you call it with call_user_func():
COUNTEREXAMPLE
$x = 41;
call_user_func('add_one', $x);
...##$x## will not be modified. The solution is to use call_user_func_array()
and wrap the reference in an array:
$x = 41;
call_user_func_array(
'add_one',
array(&$x)); // Note '&$x'!
This will work as expected.
= You Can't Throw From __toString() =
If you throw from __toString(), your program will terminate uselessly and you
won't get the exception.
= An Object Can Have Any Scalar as a Property =
Object properties are not limited to legal variable names:
$property = '!@#$%^&*()';
$obj->$property = 'zebra';
echo $obj->$property; // Outputs 'zebra'.
So, don't make assumptions about property names.
= There is an (object) Cast =
You can cast a dictionary into an object.
$obj = (object)array('flavor' => 'coconut');
echo $obj->flavor; // Outputs 'coconut'.
echo get_class($obj); // Outputs 'stdClass'.
This is occasionally useful, mostly to force an object to become a Javascript
dictionary (vs a list) when passed to json_encode().
= Invoking "new" With an Argument Vector is Really Hard =
If you have some ##$class_name## and some ##$argv## of constructor
arguments and you want to do this:
new $class_name($argv[0], $argv[1], ...);
...you'll probably invent a very interesting, very novel solution that is very
wrong. In a libphutil environment, solve this problem with newv(). Elsewhere,
copy newv()'s implementation.
= Equality is not Transitive =
This isn't terribly surprising since equality isn't transitive in a lot of
languages, but the == operator is not transitive:
$a = ''; $b = 0; $c = '0a';
$a == $b; // true
$b == $c; // true
$c == $a; // false!
When either operand is an integer, the other operand is cast to an integer
before comparison. Avoid this and similar pitfalls by using the === operator,
which is transitive.
= All 676 Letters in the Alphabet =
This doesn't do what you'd expect it to do in C:
for ($c = 'a'; $c <= 'z'; $c++) {
// ...
}
This is because the successor to 'z' is 'aa', which is "less than" 'z'. The
loop will run for ~700 iterations until it reaches 'zz' and terminates. That is,
##$c<## will take on these values:
a
b
...
y
z
aa // loop continues because 'aa' <= 'z'
ab
...
mf
mg
...
zw
zx
zy
zz // loop now terminates because 'zz' > 'z'
Instead, use this loop:
foreach (range('a', 'z') as $c) {
// ...
}

View file

@ -0,0 +1,140 @@
@title Things You Should Do Now
@group flavortext
Describes things you should do now when building software, because the cost to
do them increases over time and eventually becomes prohibitive or impossible.
= Overview =
If you're building a hot new web startup, there are a lot of decisions to make
about what to focus on. Most things you'll build will take about the same amount
of time to build regardless of what order you build them in, but there are a few
technical things which become vastly more expensive to fix later.
If you don't do these things early in development, they'll become very hard or
impossible to do later. This is basically a list of things that would have saved
Facebook huge amounts of time and effort down the road if someone had spent
a tiny amount of time on them earlier in the development process.
See also @{article:Things You Should Do Soon} for things that scale less
drastically over time.
= Start IDs At a Gigantic Number =
If you're using integer IDs to identify data or objects, **don't** start your
IDs at 1. Start them at a huge number (e.g., 2^33) so that no object ID will
ever appear in any other role in your application (like a count, a natural
index, a byte size, a timestamp, etc). This takes about 5 seconds if you do it
before you launch and rules out a huge class of nasty bugs for all time. It
becomes incredibly difficult as soon as you have production data.
The kind of bug that this causes is accidental use of some other value as an ID:
COUNTEREXAMPLE
// Load the user's friends, returns a map of friend_id => true
$friend_ids = user_get_friends($user_id);
// Get the first 8 friends.
$first_few_friends = array_slice($friend_ids, 0, 8);
// Render those friends.
render_user_friends($user_id, array_keys($first_few_friends));
Because array_slice() in PHP discards array indices and renumbers them, this
doesn't render the user's first 8 friends but the users with IDs 0 through 7,
e.g. Mark Zuckerberg (ID 4) and Dustin Moskovitz (ID 6). If you have IDs in this
range, sooner or later something that isn't an ID will get treated like an ID
and the operation will be valid and cause unexpected behavior. This is
completely avoidable if you start your IDs at a gigantic number.
= Only Store Valid UTF-8 =
For the most part, you can ignore UTF-8 and unicode until later. However, there
is one aspect of unicode you should address now: store only valid UTF-8 strings.
Assuming you're storing data internally as UTF-8 (this is almost certainly the
right choice and definitely the right choice if you have no idea how unicode
works), you just need to sanitize all the data coming into your application and
make sure it's valid UTF-8.
If your application emits invalid UTF-8, other systems (like browsers) will
break in unexpected and interesting ways. You will eventually be forced to
ensure you emit only valid UTF-8 to avoid these problems. If you haven't
sanitized your data, you'll basically have two options:
- do a huge migration on literally all of your data to sanitize it; or
- forever sanitize all data on its way out on the read pathways.
As of 2011 Facebook is in the second group, and spends several milliseconds of
CPU time sanitizing every display string on its way to the browser, which
multiplies out to hundreds of servers worth of CPUs sitting in a datacenter
paying the price for the invalid UTF-8 in the databases.
You can likely learn enough about unicode to be confident in an implementation
which addresses this problem within a few hours. You don't need to learn
everything, just the basics. Your language probably already has a function which
does the sanitizing for you.
= Never Design a Blacklist-Based Security System =
When you have an alternative, don't design security systems which are default
permit, blacklist-based, or otherwise attempt to enumerate badness. When
Facebook launched Platform, it launched with a blacklist-based CSS filter, which
basically tried to enumerate all the "bad" parts of CSS and filter them out.
This was a poor design choice and lead to basically infinite security holes for
all time.
It is very difficult to enumerate badness in a complex system and badness is
often a moving target. Instead of trying to do this, design whitelist-based
security systems where you list allowed things and reject anything you don't
understand. Assume things are bad until you verify that they're OK.
It's tempting to design blacklist-based systems because they're easier to write
and accept more inputs. In the case of the CSS filter, the product goal was for
users to just be able to use CSS normally and feel like this system was no
different from systems they were familiar with. A whitelist-based system would
reject some valid, safe inputs and create product friction.
But this is a much better world than the alternative, where the blacklist-based
system fails to reject some dangerous inputs and creates //security holes//. It
//also// creates product friction because when you fix those holes you break
existing uses, and that backward-compatibility friction makes it very difficult
to move the system from a blacklist to a whitelist. So you're basically in
trouble no matter what you do, and have a bunch of security holes you need to
unbreak immediately, so you won't even have time to feel sorry for yourself.
Designing blacklist-based security is one of the worst now-vs-future tradeoffs
you can make. See also "The Six Dumbest Ideas in Computer Security":
http://www.ranum.com/security/computer_security/
= Fail Very Loudly when SQL Syntax Errors Occur in Production =
This doesn't apply if you aren't using SQL, but if you are: detect when a query
fails because of a syntax error (in MySQL, it is error 1064). If the failure
happened in production, fail in the loudest way possible. (I implemented this in
2008 at Facebook and had it just email me and a few other people directly. The
system was eventually refined.)
This basically creates a high-signal stream that tells you where you have SQL
injection holes in your application. It will have some false positives and could
theoretically have false negatives, but at Facebook it was pretty high signal
considering how important the signal is.
Of course, the real solution here is to not have SQL injection holes in your
application, ever. As far as I'm aware, this system correctly detected the one
SQL injection hole we had from mid-2008 until I left in 2011, which was in a
hackathon project on an underisolated semi-production tier and didn't use the
query escaping system the rest of the application does.
Hopefully, whatever language you're writing in has good query libraries that
can handle escaping for you. If so, use them. If you're using PHP and don't have
a solution in place yet, the Phabricator implementation of qsprintf() is similar
to Facebook's system and was successful there.

View file

@ -0,0 +1,136 @@
@title Things You Should Do Soon
@group flavortext
Describes things you should start thinking about soon, because scaling will
be easier if you put a plan in place.
= Overview =
Stop! Don't do any of this yet. Go do @{article:Things You Should Do Now} first.
Then you can come back and read about these things. These are basically some
problems you'll probably eventually encounter when building a web application
that might be good to start thinking about now.
= Static Resources: JS and CSS =
Over time, you'll write more JS and CSS and eventually need to put systems in
place to manage it.
== Manage Dependencies Automatically ==
The naive way to add static resources to a page is to include them at the top
of the page, before rendering begins, by enumerating filenames. Facebook used to
work like that:
COUNTEREXAMPLE
<?php
require_js('js/base.js');
require_js('js/utils.js');
require_js('js/ajax.js');
require_js('js/dialog.js');
// ...
This was okay for a while but had become unmanageable by 2007. Because
dependencies were managed completely manually and you had to explicitly list
every file you needed in the right order, everyone copy-pasted a giant block
of this stuff into every page. The major problem this created was that each page
pulled in way too much JS, which slowed down frontend performance.
We moved to a system (called //Haste//) which declared JS dependencies in the
files using a docblock-like header:
/**
* @provides dialog
* @requires utils ajax base
*/
We annotated files manually, although theoretically you could use static
analysis instead (we couldn't realistically do that, our JS was pretty
unstructured). This allowed us to pull in the entire dependency chain of
component with one call:
require_static('dialog');
...instead of copy-pasting every dependency.
== Include When Used ==
The other part of this problem was that all the resources were required at the
top of the page instead of when they were actually used. This meant two things:
- you needed to include every resource that //could ever// appear on a page;
- if you were adding something new to 2+ pages, you had a strong incentive to
put it in base.js.
So every page pulled in a bunch of silly stuff like the CAPTCHA code (because
there was one obscure workflow involving unverified users which could
theoretically show any user a CAPTCHA on any page) and every random thing anyone
had stuck in base.js.
We moved to a system where JS and CSS tags were output **after** page rendering
had run instead (they still appeared at the top of the page, they were just
prepended rather than appended before being output to the browser -- there are
some complexities here, but they are beyond the immediate scope), so
require_static() could appear anywhere in the code. Then we moved all the
require_static() calls to be proximate to their use sites (so dialog rendering
code would pull in dialog-related CSS and JS, for example, not any page which
might need a dialog), and split base.js into a bunch of smaller files.
== Packaging ==
The biggest frontend performance killer in most cases is the raw number of HTTP
requests, and the biggest hammer for addressing it is to package related JS
and CSS into larger files, so you send down all the core JS code in one big file
instead of a lot of smaller ones. Once the other groundwork is in place, this is
a relatively easy change. We started with manual package definitions and
eventually moved to automatic generation based on production data.
== Caches and Serving Content ==
In the simplest implementation of static resources, you write out a raw JS tag
with something like ##src="/js/base.js"##. This will break disastrously as you
scale, because clients will be running with stale versions of resources. There
are bunch of subtle problems (especially once you have a CDN), but the big one
is that if a user is browsing your site as you push/deploy, their client will
not make requests for the resources they already have in cache, so even if your
servers respond correctly to If-None-Match (ETags) and If-Modified-Since
(Expires) the site will appear completely broken to everyone who was using it
when you push a breaking change to static resources.
The best way to solve this problem is to version your resources in the URI,
so each version of a resource has a unique URI:
rsrc/af04d14/js/base.js
When you push, users will receive pages which reference the new URI so their
browsers will retrieve it.
**But**, there's a big problem, once you have a bunch of web frontends:
While you're pushing, a user may make a request which is handled by a server
running the new version of the code, which delivers a page with a new resource
URI. Their browser then makes a request for the new resource, but that request
is routed to a server which has not been pushed yet, which delivers an old
version of the resource. They now have a poisoned cache: old resource data for
a new resource URI.
You can do a lot of clever things to solve this, but the solution we chose at
Facebook was to serve resources out of a database instead of off disk. Before a
push begins, new resources are written to the database so that every server is
able to satisfy both old and new resource requests.
This also made it relatively easy to do processing steps (like stripping
comments and whitespace) in one place, and just insert a minified/processed
version of CSS and JS into the database.
== Reference Implementation: Celerity ==
Some of the ideas discussed here are implemented in Phabricator's //Celerity//
system, which is essentially a simplified version of the //Haste// system used
by Facebook.

View file

@ -0,0 +1,140 @@
@title Javascript Coding Standards
@group contrib
This document describes Javascript coding standards for Phabricator and Javelin.
= Overview =
This document outlines technical and style guidelines which are followed in
Phabricator and Javelin. Contributors should also follow these guidelines. Many
of these guidelines are automatically enforced by lint.
These guidelines are essentially identical to the Facebook guidelines, since I
basically copy-pasted them. If you are already familiar with the Facebook
guidelines, you can probably get away with skimming this document.
= Spaces, Linebreaks and Indentation =
- Use two spaces for indentation. Don't use literal tab characters.
- Use Unix linebreaks ("\n"), not MSDOS ("\r\n") or OS9 ("\r").
- Use K&R style braces and spacing.
- Put a space after control keywords like ##if## and ##for##.
- Put a space after commas in argument lists.
- Put space around operators like ##=##, ##<##, etc.
- Don't put spaces after function names.
- Parentheses should hug their contents.
- Generally, prefer to wrap code at 80 columns.
= Case and Capitalization =
The Javascript language unambiguously dictates casing/naming rules; follow those
rules.
- Name variables using ##lowercase_with_underscores##.
- Name classes using ##UpperCamelCase##.
- Name methods and properties using ##lowerCamelCase##.
- Name global functions using ##lowerCamelCase##. Avoid defining global
functions.
- Name constants using ##UPPERCASE##.
- Write ##true##, ##false##, and ##null## in lowercase.
- "Internal" methods and properties should be prefixed with an underscore.
For more information about what "internal" means, see
**Leading Underscores**, below.
= Comments =
- Strongly prefer ##//## comments for making comments inside the bodies of
functions and methods (this lets someone easily comment out a block of code
while debugging later).
= Javascript Language =
- Use ##[]## and ##{}##, not ##new Array## and ##new Object##.
- When creating an object literal, do not quote keys unless required.
= Examples =
**if/else:**
lang=js
if (x > 3) {
// ...
} else if (x === null) {
// ...
} else {
// ...
}
You should always put braces around the body of an if clause, even if it is only
one line. Note that operators like ##>## and ##===## are also surrounded by
spaces.
**for (iteration):**
lang=js
for (var ii = 0; ii < 10; ii++) {
// ...
}
Prefer ii, jj, kk, etc., as iterators, since they're easier to pick out
visually and react better to "Find Next..." in editors.
**for (enumeration):**
lang=js
for (var k in obj) {
// ...
}
Make sure you use enumeration only on Objects, not on Arrays. For more details,
see @{article:Javascript Object and Array}.
**switch:**
lang=js
switch (x) {
case 1:
// ...
break;
case 2:
if (flag) {
break;
}
break;
default:
// ...
break;
}
##break## statements should be indented to block level. If you don't push them
in, you end up with an inconsistent rule for conditional ##break## statements,
as in the ##2## case.
If you insist on having a "fall through" case that does not end with ##break##,
make it clear in a comment that you wrote this intentionally. For instance:
lang=js
switch (x) {
case 1:
// ...
// Fall through...
case 2:
//...
break;
}
= Leading Underscores =
By convention, methods names which start with a leading underscore are
considered "internal", which (roughly) means "private". The critical difference
is that this is treated as a signal to Javascript processing scripts that a
symbol is safe to rename since it is not referenced outside the current file.
The upshot here is:
- name internal methods which shouldn't be called outside of a file's scope
with a leading underscore; and
- **never** call an internal method from another file.
If you treat them as though they were "private", you won't run into problems.

View file

@ -9,6 +9,11 @@ This document outlines technical and style guidelines which are followed in
libphutil. Contributors should also follow these guidelines. Many of these
guidelines are automatically enforced by lint.
These guidelines are essentially identical to the Facebook guidelines, since I
basically copy-pasted them. If you are already familiar with the Facebook
guidelines, you probably don't need to read this super thoroughly.
= Spaces, Linebreaks and Indentation =
- Use two spaces for indentation. Don't use tab literal characters.