Azure Table inserts do not scale

After investigating the error I reported in this post, I have found evidence that something is going wrong with Azure Storage performance. After around 54070 * 10 batches, or 54070 * 10 * 100 rows, any insert takes a disproportionate amount of time.

To reproduce it, I just inserted dummy data into the partition and monitored how long it took to insert every 10 batches.
The code is attached below.
I suspect this problem is very recent; I already had this much data without any problem at all.

Here is the source code to reproduce this scaling problem (you have to wait around 54070 * 10 batches before the problems start):

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Table;

static void Main(string[] args)
{
    StorageCredentials creds = new StorageCredentials("account", "pass");
    CloudStorageAccount account = new CloudStorageAccount(creds, false);
    var table = account.CreateCloudTableClient().GetTableReference("scaleproblem3");
    table.CreateIfNotExists();

    Random rand = new Random();
    var buffer = new byte[32];
    FileStream fs = File.Open("data.csv", FileMode.Append);
    StreamWriter writer = new StreamWriter(fs);
    int total = 0;
    int count = 0;
    Stopwatch watch = new Stopwatch();
    watch.Start();
    while (true)
    {
        // Build a batch of 100 entities with random row keys, all in partition "a".
        TableBatchOperation batch = new TableBatchOperation();
        for (int i = 0; i < 100; i++)
        {
            rand.NextBytes(buffer);
            // Decimal-format each byte (at least two digits) to build the row key.
            var rowKey = String.Join("", buffer.Select(b => b.ToString("00")).ToArray());
            batch.Add(TableOperation.InsertOrReplace(new DynamicTableEntity("a", rowKey)));
        }
        table.ExecuteBatch(batch);
        count++;

        // Every 10 batches, log "<total batches>,<seconds for the last 10 batches>".
        if (count == 10)
        {
            total += count;
            count = 0;
            var elapsedSec = (int)watch.Elapsed.TotalSeconds;
            watch.Restart();
            string line = total + "," + elapsedSec;
            writer.WriteLine(line);
            writer.Flush();
            Console.WriteLine(line);
        }
    }
}


January 10th, 2015 7:23pm

The storage account is in West Europe, like the VM that ran the test, and in the same affinity group.






January 10th, 2015 7:24pm

Yes TChiang.

The 54070 is approximately where the slowness started (reading from the graph I generated with the data from the repro code).

I was taking one sample (batchCount, insertTime) every 10 batches.

Which means that 54070 corresponded to 10 * 54070 batches, and so 10 * 54070 * 100 rows.

Using Insert instead of InsertOrReplace does not change anything. Each row is approximately 100 bytes.

You can run this code yourself. You will see that around 54070 +- 5000, you'll start experiencing the same problem. (It took me several hours.)

[UPDATE]

No, in fact, since it is the number of batches, the bug appeared after 54070 * 100 rows.
Nevertheless, the point is that after some threshold, inserts suddenly take an unreasonable amount of time.

I am almost sure this is a bug in the storage service, since I have never seen this before, and no documentation talks about the consequences of partition size on insert time.
The fact that it jumps suddenly rather than progressively is also very strange.

Also, the limit does not apply to the table but to the partition.
It is not the storage library's fault: as I showed in the previous post, Fiddler clearly shows that the storage service takes time to respond to the HTTP requests.


January 11th, 2015 12:36am

This limit is undocumented.
Moreover, I never had this problem before. So I'd like confirmation about it from Microsoft, and to know whether they will fix it.

I am not considering using more partitions. Using more would impact the RAM requirements of my application. (Every time I add 1 bit of information to a partition key, I need to double the amount of data I am storing in RAM.)
If Microsoft can't do anything about it, and is clear about partition limits, then I'll think about re-architecting my solution. Since some systems are already deployed, it will also mean redeploying everything with the new partitioning strategy and reindexing tons of data, which will take a lot of time.

As I said, the storage library can't do anything about it. It sends the request as it should, but can't influence how much time the storage server takes for the insert. The Fiddler screenshot in the previous post proves that.
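For reference, you can also capture the server round-trip timing without Fiddler. This is a minimal sketch (assuming the same Microsoft.WindowsAzure.Storage SDK, and the `table` and `batch` variables from the repro code) that hooks an OperationContext to log how long each batch request takes:

```csharp
// Sketch: log per-request timing via OperationContext instead of Fiddler.
// RequestCompleted fires once per HTTP request; StartTime/EndTime bracket
// the request as measured by the client, so network latency is included.
var context = new OperationContext();
context.RequestCompleted += (sender, e) =>
{
    var r = e.RequestInformation;
    Console.WriteLine("status={0} elapsed={1}ms requestId={2}",
        r.HttpStatusCode,
        (r.EndTime - r.StartTime).TotalMilliseconds,
        r.ServiceRequestID);
};
table.ExecuteBatch(batch, null /* TableRequestOptions */, context);
```

The ServiceRequestID it prints is the same ID the storage team can use to look up a request on their side.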


Free Windows Admin Tool Kit Click here and download it now
January 11th, 2015 2:38am

It seems to me that this isn't really a real world scenario... What situation could possibly clamp to force only one PartitionKey, with 54 million rows underneath? How do you plan to find a single record or group of records in that mess?


I have sparse data, which means that very little of it is accessed, but when it is accessed, I do a range query, which is really fast within the partition. I initially had 255 partitions.

The scenario is indexing the Bitcoin blockchain. There are more than 300,000 million lines to index, and it increases every day. I need to insert all that data the most efficient way I can, without blowing up my RAM. Having 255 partitions was good enough for me and took only 100 MB of RAM. Improving that means I need to store more entities in RAM during the insert (now taking 10 bytes each, which is 400 MB in RAM).

My query load is very low. Maybe one query per minute, so I don't care about throttling once everything is indexed.

I am fine with Microsoft not supporting such a big partition... but they should document it, because it suddenly messed up my environment and made me reindex everything from scratch.
I can adapt to smaller partitions, but the partitioning strategy is an architectural decision that can have a high impact if you need to change it.
You don't change 400,000 million lines by clapping your hands.
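To make the partition/RAM trade-off concrete, here is a hypothetical sketch (PartitionFor is an illustrative helper, not part of my actual code): derive the partition key from the leading bits of the random row key, so each extra bit doubles the partition count.

```csharp
// Hypothetical helper: use the first 'bits' bits of the row key bytes
// as the partition key. More bits = more partitions = better load
// balancing, but also more per-partition state to keep in RAM.
static string PartitionFor(byte[] rowKeyBytes, int bits)
{
    int value = 0;
    for (int i = 0; i < bits; i++)
        value = (value << 1) | ((rowKeyBytes[i / 8] >> (7 - (i % 8))) & 1);
    return value.ToString("x2");
}
// With bits = 8, keys spread over 256 partitions, close to the 255 above.
```

Going from 8 to 9 bits doubles the partition count, and with it the per-partition bookkeeping I keep in RAM during the insert.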
January 15th, 2015 8:08pm

Try querying your 54 million entities as you would in your app before you go any further.

It works, no problem, near instant, since they are range queries. I am well aware of how to optimize queries, and never had a problem with them. My problem is about inserts and the magic threshold.
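The queries in question look roughly like this (a sketch against the same table SDK as the repro code; the row-key bounds are made-up values): a fixed PartitionKey plus a RowKey range, which the service can answer as one contiguous scan.

```csharp
// Sketch: range query inside one partition - the fast path described above.
// The row-key bounds "0500"/"0600" are illustrative values only.
string filter = TableQuery.CombineFilters(
    TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "a"),
    TableOperators.And,
    TableQuery.CombineFilters(
        TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, "0500"),
        TableOperators.And,
        TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, "0600")));

var query = new TableQuery<DynamicTableEntity>().Where(filter);
foreach (var entity in table.ExecuteQuery(query))
    Console.WriteLine(entity.RowKey);
```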

Read the scaling and performance docs for azure, and their best practices.  Watch their Channel 9 videos.  Check out the MSDN magazine blogs, such as this one.  Read about partitioning and scaling and continuation tokens.

I have read it all, listened to it all, and no one ever mentioned that you should not put 54 million entities in a partition.
What they say is that you should not query more than 2000 entities per second per partition. And I don't.
Nothing says that insert time depends on the size of the partition.

I can even quote Microsoft: "In general, write and query times are less affected by how the table is partitioned." - Niranjan Nilakantan of Microsoft (Source)

It is not as if I have never used Azure and don't know how the storage works. I have used it in production for several years, and I am even a Microsoft Certified Trainer on it. The data is several TB.

What I complain about is that the limits are not documented, and worse, depend on the location of your storage region. As I said, changing my partition strategy impacts RAM... If I don't want to impact RAM, I need to make smaller batches, which means more transactions. More transactions make my performance suffer from more latency, as well as the price.

West Europe is a shitty region.

I did not choose my partitions at random.

Worse, the problem in West Europe came after 5 million lines, at 100 bytes each, which is 500 MB... hardly big data. The problem is in the number of lines, not the size.





January 16th, 2015 12:59am

Re : I guess I just assumed that based on other laws of storage.  It makes sense that storing items in an existing structure containing 1 million would be faster than storing items in an existing group of 1 billion.

It does not. Since range queries are fast, it means that internally Microsoft is surely using binary trees to store items within a partition. A binary tree's insert complexity is O(log n), so I should get O(log n) time for inserts, not a WTF bump after a magic threshold.

It would make sense that partitions that grow big in size get their storage node split after some threshold... but geez, 500 MB is not the end of the world for one storage node... Splitting would make my performance slower for some time, until the split is finished and the second storage node is operational. (I should take a look at the MS white paper.)

Re : The individual developer would have to take their need and build a POC of a variety of different methods, then find the best one for them.

And I did! The magic started happening suddenly, not at the time of my POC.

I'll try to make a call, or ask some connection that can get in contact with MS.

[UPDATE]

Reading the Storage white paper http://www-bcf.usc.edu/~minlanyu/teach/csci599-fall12/papers/11-calder.pdf to solve the mystery...

[/UPDATE]

[Conclusions]

West Europe is shit; don't put your big data there. Please do not mark this as the answer, since Microsoft did not give any solution or response.

[/Conclusions]


January 16th, 2015 5:06pm

The larger a partition gets, the relatively slower random writes may become. Small partitions are better for performance and scale, as we can load-balance them to meet the scale of your traffic. That said, it would be good to look at your storage account to confirm whether what you are seeing is due to just the large partition or some other reason. If you can send an email to ascl@microsoft.com with your account name and the approximate timeline of this slowness, we can have a look at it. Thanks.
March 9th, 2015 9:08pm

I sent a mail but it got rejected (it said "not authenticated"), so I copy it here:

Please don't mark a response as the answer when no solution has been provided.

I am contacting you about the scaling problem noticed in West Europe and detailed here: https://social.msdn.microsoft.com/Forums/azure/en-US/dbde4333-26a7-44b5-a2a6-8d373dd12d89/azure-table-inserts-do-not-scale?forum=windowsazuredata

I repeat myself: I am not affected in North Europe until way, way more data is inserted. In North Europe, the time a batch insert takes goes up linearly, and only after much more data is inserted. Again, please read my thread on the forum carefully; I documented the problem carefully. Microsoft tends to give copy/pasted responses on the forum after reading the first 3 words, without reading the content. Please don't waste the time I spent documenting the problem. The difference in performance behavior between regions has clearly led me to advise some regions over others to my customers, depending not on the geographic position of their users, but on the quality of the region's storage. The state of West Europe storage is BAD, and not advisable for production for some of my customers. The performance has gotten even worse since the time I posted on the forum. I reproduced the problem on the storage account "nbitcoin".

A tool running my test program is running right now, and it takes 3 seconds per batch to insert. I noticed the same pattern I'm talking about in the thread.

This graph shows the insert time degrading after 21570 batches in one partition (2,157,000 entities). The same graph in North Europe only goes up, linearly, after more than 150,000 batches.

Entity size is approximately 32 bytes (about 65 MB total, which is hardly "a big partition").

Now, Microsoft, please: either do not respond to this thread, or respond after having read my actual complaint COMPLETELY. But do not mark it solved until you have an answer about why West Europe performs so much worse than North Europe.

I am very pleased and excited by Azure products, but the way questions are handled on this forum has been a big waste of time until now, and it is not the first time this has happened.

Sorry if this seems aggressive, but it hurts to spend so much time documenting your flaws and see it ignored with the copy/pasted response you give everywhere.



March 10th, 2015 1:38am

My apologies for the inconvenience. I have a different contact who should be able to help you out with your problem: Manish.Chablani@microsoft.com. Please include me on the TO line as well: micurd@microsoft.com. Thanks.
March 12th, 2015 1:56pm

This topic is archived. No further replies will be accepted.
