How does Cassandra partitioning work when replication factor == cluster size?


Background:

I'm new to Cassandra and still trying to wrap my mind around its internal workings.

I'm thinking of using Cassandra in an application that will only ever have a limited number of nodes (fewer than 10, most commonly 3). Ideally, each node in the cluster would have a complete copy of all the application data. So I'm considering setting the replication factor to the cluster size. When additional nodes are added, I would alter the keyspace to increment the replication factor setting (and run nodetool repair to ensure each node gets the necessary data).
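As a concrete sketch of that grow-the-RF step, assuming a hypothetical keyspace `app_data` and datacenter `dc1` (both names are placeholders, not from the question):

```shell
# After adding a fourth node, bump the replication factor from 3 to 4
# so the keyspace again replicates to every node in the cluster.
cqlsh -e "ALTER KEYSPACE app_data
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 4};"

# Altering the keyspace only changes metadata; a full repair actually
# streams the existing data onto the new replica.
nodetool repair --full app_data
```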

I would be using NetworkTopologyStrategy for replication to take advantage of its knowledge of datacenters.

In this situation, how does partitioning work? I've read about the combination of nodes and partition keys forming a ring in Cassandra. If all of the nodes are "responsible" for each piece of data regardless of the hash value calculated by the partitioner, do I just have a ring of one partition key?

Are there tremendous downfalls to this type of Cassandra deployment? I'm guessing there would be lots of asynchronous replication going on in the background as data is propagated to every node, but that is one of my design goals, so I'm okay with it.

The consistency level on reads would be "one" or "local_one".

The consistency level on writes would be "two".
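With RF equal to the cluster size, those consistency levels translate directly into failure tolerance: a write at TWO needs two live replicas, a read at ONE needs one. A tiny sketch of that arithmetic (the numbers are taken from the question's 3-node scenario, not from Cassandra itself):

```python
def tolerated_failures(cluster_size: int, required_replicas: int) -> int:
    """Nodes that can be down while requests at this consistency level
    still succeed, assuming RF == cluster_size (every node holds every row)."""
    return cluster_size - required_replicas

# 3-node cluster, RF = 3:
print(tolerated_failures(3, 2))  # writes at CL TWO survive one node down
print(tolerated_failures(3, 1))  # reads at CL ONE survive two nodes down
```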

Actual questions to answer:

  1. Is replication factor == cluster size a common (or even a reasonable) deployment strategy, aside from the obvious case of a cluster of one?
  2. Do I have a ring of one partition, where all possible values generated by the partitioner go to the one partition?
  3. Is each node considered "responsible" for every row of data?
  4. If I use a write consistency of "one", does Cassandra always write the data to the node contacted by the client?
  5. Are there other downfalls to this strategy that I don't know about?

Do I have a ring of one partition, where all possible values generated by the partitioner go to the one partition?

Is each node considered "responsible" for every row of data?

If all of the nodes are "responsible" for each piece of data regardless of the hash value calculated by the partitioner, do I just have a ring of one partition key?

Not exactly. C* nodes still have token ranges, and C* still assigns a primary replica to the "responsible" node. But all nodes will also hold a replica when RF = N (where N is the number of nodes). So in essence the implication is the same as what you described.
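A toy model may make this clearer: each key still hashes to a token and gets a distinct primary replica, but with RF = N the replica walk around the ring covers every node. This is a simplified sketch (made-up tokens and an MD5 stand-in partitioner; real Cassandra uses Murmur3 over a much larger token space):

```python
import hashlib
from bisect import bisect_right

# Toy token ring, sorted by token; each node owns the range ending at its token.
RING = [(100, "node-a"), (200, "node-b"), (300, "node-c")]
N = len(RING)

def token(key: str, modulus: int = 300) -> int:
    """Stand-in partitioner: hash the key into the ring's token space."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % modulus

def replicas(key: str, rf: int):
    """Primary replica is the first node at/after the key's token
    (wrapping around); further replicas are the next nodes clockwise."""
    t = token(key)
    start = bisect_right([tok for tok, _ in RING], t) % N
    return [RING[(start + i) % N][1] for i in range(rf)]

# With RF == cluster size, every key lands on every node,
# but each key still has a distinct primary replica (first in the list).
owners = replicas("some-partition-key", rf=N)
print(owners)
print(set(owners) == {"node-a", "node-b", "node-c"})  # True
```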

Are there tremendous downfalls to this type of Cassandra deployment? Are there other downfalls to this strategy that I don't know about?

Not that I can think of, but I guess you might be more susceptible than average to inconsistent data, so use C*'s anti-entropy mechanisms to counter that (repair, read repair, hinted handoff).

Consistency levels of quorum or above would start to get expensive, but I see you don't intend to use them.

Is replication factor == cluster size a common (or even a reasonable) deployment strategy, aside from the obvious case of a cluster of one?

It's not common; I guess you are looking for super high availability and all your data fits on one box. I don't think I've ever seen a C* deployment with RF > 5. Far and wide, RF = 3 is the standard.

If I use a write consistency of "one", does Cassandra always write the data to the node contacted by the client?

This depends on the load balancing policies at the driver. If you select a token-aware policy (assuming you're using one of the DataStax drivers), requests are routed to the primary replica automatically. You could also use round robin in your case and have the same effect.
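For reference, a token-aware policy in the DataStax Python driver is configured roughly like this. This is a connection-configuration sketch only: the contact point and the datacenter name `dc1` are placeholders, and it assumes a reachable cluster.

```python
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

# TokenAwarePolicy wraps a child policy: the driver routes each request to a
# replica for that partition key, falling back to the child policy otherwise.
profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(
        DCAwareRoundRobinPolicy(local_dc="dc1")  # placeholder DC name
    )
)
cluster = Cluster(["10.0.0.1"],  # placeholder contact point
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
```

With RF == cluster size, every node is a replica for every key, so even a plain round-robin policy always hits a replica, which is why the answer notes the two have the same effect here.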

