watchman - What are the performance and reliability implications of watching too many files?


In Facebook's Watchman application, the docs say this somewhere:

Most systems have a finite limit on the number of directories that can be watched effectively; when that limit is exceeded, the performance and reliability of filesystem watching is degraded, sometimes to the point where it ceases to function.

This seems vague to me. Before it "ceases to function", what can I expect to happen if I start watching too many files? And are we talking 100 files, 1,000 files, 100,000 files...? (I realise the number will vary on different systems, but a rough idea of a sensible limit on a modern Unix laptop would be useful.)

I have a use case that involves watching an entire node_modules folder (which contains thousands of files in nested subdirectories), and I want to know before I start work on it whether it's a complete non-starter.
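
For concreteness, this is roughly the kind of thing I'd be doing (a minimal sketch using the pywatchman client library; the project path and the .js suffix filter are just placeholders, not my real setup):

    # Sketch: watch a node_modules tree via watchman and list matching files.
    # Requires the pywatchman package and a running watchman binary.
    import pywatchman

    client = pywatchman.client()

    # Ask watchman to (re)use a watch rooted at the enclosing project root.
    watch = client.query('watch-project', '/path/to/project/node_modules')
    root = watch['watch']
    rel = watch.get('relative_path')

    # Query for all .js files under the watched subtree.
    query = {'expression': ['suffix', 'js'], 'fields': ['name']}
    if rel:
        query['relative_root'] = rel

    result = client.query('query', root, query)
    print(len(result['files']), 'files matched')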

Sorry if the docs aren't as clear as you might like.

First up, Watchman was built to accelerate tools that have to operate on extremely large trees, particularly this one, which has continued to get bigger and bigger since this was written:

https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/

Facebook's main source repository is enormous--many times larger than even the Linux kernel, which checked in at 17 million lines of code and 44,000 files in 2013.

I don't have more recent public numbers on the repo size that I can share to hand at the moment, but the main point here is that this should work fine for the vast majority of applications.

Now for the behavior of the system when the limits are exceeded. The answer depends on which operating system you're using.

There are two main system limits that impact this behavior. One of them is a direct limit on the number of watched items; when it is exceeded, you cannot watch anything else. When running on Linux, Watchman will treat this case as unrecoverable and flag itself as poisoned; when in this state, it is impossible to accurately report on file changes within the scope of the watched directories until you raise the system limit, or give up on trying to watch that part of the filesystem.
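
On Linux the limit in play is the inotify watch limit, which is an ordinary kernel sysctl rather than anything watchman-specific. Here is a minimal sketch (Linux only) for inspecting the relevant values; raising them is done separately, e.g. sysctl fs.inotify.max_user_watches=524288 as root:

    # Read the inotify limits that govern the "poisoned" case described above.
    # Linux only; these files live under /proc/sys/fs/inotify/.
    def read_inotify_limit(name):
        with open('/proc/sys/fs/inotify/' + name) as f:
            return int(f.read().strip())

    for name in ('max_user_watches', 'max_user_instances', 'max_queued_events'):
        print(name, '=', read_inotify_limit(name))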

When running on OS X, Watchman can't tell if the limit has been exceeded, due to poor diagnostics in the FSEvents API; the best it can tell is that it was unable to initiate a watch. Because FSEvents doesn't tell us what is going on, and because this limit is not user configurable, the watchman process can't put itself into a poisoned state.

The other system limit is on the number of items that the kernel has buffered for the watchman process to consume. If that buffer overflows, the kernel will start to drop change notifications. It will inform watchman that it did so, and this will cause watchman to perform a (likely expensive, given that your tree is presumably large) tree recrawl to make sure it can (re-)discover any changes it might have missed due to the overflow.
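
One rough way to notice when this has happened (assuming a watchman version that surfaces recrawls via the warning field in query responses) is to check that field after a query; a sketch with pywatchman, where the project path is a placeholder and the exact warning text is informational only:

    # Sketch: surface watchman's recrawl warning, if any, after a query.
    import pywatchman

    client = pywatchman.client()
    root = client.query('watch-project', '/path/to/project')['watch']
    resp = client.query('query', root, {'fields': ['name'],
                                        'empty_on_fresh_instance': True})

    if 'warning' in resp:
        # Repeated overflow-triggered recrawls on Linux usually mean
        # fs.inotify.max_queued_events is too small for your workload.
        print('watchman warning:', resp['warning'])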

OS X has a similar limit and similar recovery behavior, but does not allow the user to raise the limit. I've yet to observe this happening on OS X in the wild, so I'm assuming that whatever the system limit defaults to is a pretty reasonable default.

As for practical limits for various numbers of files, it really depends on your system; the filesystem, storage device, CPU power, and other applications that may be running on the system all impact the rate at which changes can be applied to the filesystem and reported by the kernel, and the rate at which your system is able to consume events from the kernel.

The rate at which you change files is also a big factor; if you have a large and busy tree that changes a lot (>100s of engineers making multiple commits per day and rebasing frequently) you have an increased risk of hitting the recrawl case.

There's no one-size-fits-all answer for tuning the system limits; you'll need to try it out and bump the limits if/when you hit them.
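
As a starting point for that tuning on Linux, where watchman needs roughly one inotify watch per directory, a quick count of the directories in the tree you intend to watch gives a lower bound to compare against fs.inotify.max_user_watches (the path below is a placeholder):

    # Sketch: estimate how many inotify watches a tree would need on Linux.
    # Watches are per-directory, so counting directories gives a rough
    # lower bound to compare against fs.inotify.max_user_watches.
    import os

    def count_dirs(root):
        total = 0
        for _dirpath, _dirnames, _filenames in os.walk(root):
            total += 1  # each visited directory needs its own watch
        return total

    print(count_dirs('/path/to/project/node_modules'))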

