
I came across this article by chance, and it is quite useful when you are considering how to implement non-functional requirements. In short, any method is a double-edged sword: it can benefit you, but it can also cause you trouble, so whether to use it depends on your situation.

Here are the details.


Sharding is a database technique where you break up a big database into many smaller ones. Instead of having 1 million customers on a single, big iron machine, you perhaps have 100,000 customers on 10 different, smaller machines.
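
As a rough illustration of the idea, the sketch below maps a customer ID to one of 10 shards with a simple modulo. The function names `shard_for` and `customers_per_shard`, the shard host names, and the modulo scheme are all assumptions made for illustration, not something the article prescribes; real setups often use a lookup table or consistent hashing instead.

```python
# Hypothetical shard list: 10 smaller machines instead of one big one.
SHARDS = [f"db-shard-{i}" for i in range(10)]


def shard_for(customer_id: int) -> str:
    """Return the (hypothetical) shard host that stores this customer."""
    if customer_id < 0:
        raise ValueError("customer_id must be non-negative")
    return SHARDS[customer_id % len(SHARDS)]


def customers_per_shard(total_customers: int) -> dict:
    """Count how many of the IDs 0..total_customers-1 land on each shard."""
    counts = {name: 0 for name in SHARDS}
    for customer_id in range(total_customers):
        counts[shard_for(customer_id)] += 1
    return counts
```

With 1,000,000 sequential IDs, `customers_per_shard(1_000_000)` puts 100,000 customers on each of the 10 shards, matching the example above. The cost shows up in any query that spans customers, since it now has to touch more than one machine.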

The general advice on sharding is that you don’t until you have to. It’s similar to Martin Fowler’s First Law of Distributed Object Design: Don’t distribute your objects! Sharding is still relatively hard, has relatively poor tool support, and will definitely complicate your setup.

Now I always knew that the inevitable day would come where we would have no choice. We would simply have to shard because there was no more vertical scaling to be done. But that day seems to get pushed further and further into the future.

Bigger caches, more reads
Our read performance is in some respects being taken care of by the fact that you can get machines with 256GB RAM now. We upgraded the Basecamp database server from 32GB to 128GB RAM a while back and we thought that would be the end of it.

The box was maxed out and going beyond 128GB at the time was stupid expensive. But now there’s 256GB to be had at a reasonable price and I’m starting to think that by the time we reach that, there’ll be reasonably priced 512GB machines.

So as long as Moore’s law can give us capacity jumps like that, we can keep the entire working set in memory and all will be good. And even if we should hit a ceiling there, we can still go to active read slaves before worrying about sharding.
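
To make the read-slave option a little more concrete, here is a minimal sketch of sending reads to replicas while writes stay on a single primary. The class name `ReadWriteRouter`, the host names, and the crude "starts with SELECT" check are all illustrative assumptions; the sketch ignores replication lag and transactions, which a real setup would have to handle.

```python
import itertools


class ReadWriteRouter:
    """Send writes to one primary and spread reads over replicas (round-robin)."""

    def __init__(self, primary: str, replicas: list):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def host_for(self, sql: str) -> str:
        """Pick a host for the statement; only plain SELECTs count as reads."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

All the data still lives in one logical database here, so the application keeps a single schema and only the connection choice changes. That is a much smaller complexity tax than splitting the data itself.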

The bigger problem is writes
Traditionally it hasn’t been read performance that caused people to shard anyway. It has been write performance. Our applications are still very heavy on the reads vs writes, so it’s less of a problem than it is for many others.

But with the rise of SSDs, like Fusion-IO’s ioDrive that can do 120K IOPS, it seems that we’re going to be saved by the progress of technology once again by the time we need it.

Punt on sharding
So where does that leave sharding? For us, we’re in the same position we’ve been in for the past few years. We just don’t need to pay the complexity tax yet, so we don’t. That’s not to say that sharding doesn’t have other benefits than simply allowing that which otherwise couldn’t be, but the trade is not yet good enough.

One point of real pain we’ve suffered, though, is that migrating a database schema in MySQL on a huge table takes forever and a day. That’s a very real problem if you want to avoid an enterprisey schema full of kludges put in place to avoid adding, renaming, or dropping columns on big tables. Or avoid long scheduled maintenance windows.

I really hope that the clever chaps at MySQL come up with something more reasonable for that problem, though. I’m told that PostgreSQL is a lot more accommodating in this regard, so hopefully competition will raise all boats for that.

Don’t try to preempt tomorrow
I guess the conclusion is that there’s no use in preempting the technological progress of tomorrow. Machines will get faster and cheaper all the time, but you’ll still only have the same limited programming resources that you had yesterday.

If you can spend them on adding stuff that users care about instead of prematurely optimizing for the future, you stand a better chance of being in business when that tomorrow finally rolls around.
