high_scalability high_scalability-2009 high_scalability-2009-636 knowledge-graph by maker-knowledge-mining

636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability


meta info for this blog

Source: html

Introduction: InfoQ has this excellent talk by Brian Goetz on the new features being added to Java SE 7 that will allow programmers to fully exploit our massively multi-processor future. While the talk is about Java, it's really more general than that and there's a lot to learn here for everyone. Brian starts with a short, coherent, and compelling explanation of why programmers can't expect to be saved by ever faster CPUs and why we must learn to exploit the strengths of multiple core computers to make our software go faster. Some techniques for exploiting multiple cores are given in an equally short, coherent, and compelling explanation of divide and conquer as the secret to multi-core bliss, fork-join, how the Java approach differs from map-reduce, and lots of other juicy topics. The multi-core "problem" is only going to get worse. Tilera founder Anant Agarwal estimates that by 2017 embedded processors could have 4,096 cores, server CPUs might have 512 cores, and desktop chips could use 128 cores.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Brian starts with a short, coherent, and compelling explanation of why programmers can't expect to be saved by ever faster CPUs and why we must learn to exploit the strengths of multiple core computers to make our software go faster. [sent-3, score-0.465]

2 Some techniques for exploiting multiple cores are given in an equally short, coherent, and compelling explanation of divide and conquer as the secret to multi-core bliss, fork-join, how the Java approach differs from map-reduce, and lots of other juicy topics. [sent-4, score-0.815]

3 Tilera founder Anant Agarwal estimates by 2017 embedded processors could have 4,096 cores, server CPUs might have 512 cores and desktop chips could use 128 cores. [sent-6, score-0.397]

4 Some disagree saying this is too optimistic, but Agarwal maintains the number of cores will double every 18 months. [sent-7, score-0.418]

5 The number of cores is increasing, so applications must now search for fine-grain parallelism (fork-join). As hardware becomes more parallel, with more and more cores, software has to look for techniques to find more and more parallelism to keep the hardware busy. [sent-13, score-1.187]

6 The free lunch of ever-faster processors allowed programmers to be lazy, because a faster processor would be released that saved your butt. [sent-15, score-0.327]

7 So more processing power could be put on a chip, which leads to putting more and more processing cores on a chip. [sent-30, score-0.45]

8 The number of cores will grow at exponential rate for the next 10 years. [sent-33, score-0.418]

9 The problem is it's harder to make a program go faster on a multicore system. [sent-38, score-0.474]

10 If you have 100 cores, your program won't go faster unless you explicitly design it to take advantage of those cores. [sent-40, score-0.595]

11 You must now be able to partition your program so it can run faster by running on multiple cores. [sent-42, score-0.282]

12 And you must be able to keep doing that as the number of cores keeps growing. [sent-43, score-0.492]

13 Programs started off with coarse-grain tasks, which was sufficient given the small number of cores. [sent-46, score-0.345]

14 This approach won't work as the number of cores increases. [sent-47, score-0.508]

15 The coarse-grained threading approach is to use a thread pool, divide up the numbers, and let the task pool compute the subproblems. [sent-59, score-0.996]
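A minimal sketch of that coarse-grained style, assuming the problem is summing a large array (the workload, the chunk count, and every name here are illustrative choices, not code from the talk): the problem is divided into a fixed number of pieces up front and each piece is submitted to a shared thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CoarseGrainedSum {
    // Decided up front, when the code is written -- not when it runs.
    static final int CHUNKS = 8;

    static long sum(long[] data) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(CHUNKS);
        List<Future<Long>> parts = new ArrayList<>();
        int chunkSize = (data.length + CHUNKS - 1) / CHUNKS;
        for (int i = 0; i < data.length; i += chunkSize) {
            final int from = i;
            final int to = Math.min(i + chunkSize, data.length);
            // Each chunk is one coarse task handed to the shared pool.
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int j = from; j < to; j++) s += data[j];
                return s;
            }));
        }
        long total = 0;
        for (Future<Long> part : parts) total += part.get(); // combine sub-results
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        System.out.println(sum(data));
    }
}
```

Note the fixed CHUNKS constant; the next two summary sentences describe why that up-front choice becomes the limiting factor.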

16 A shared task pool slows down as the number of tasks increases, which forces the work to be more coarse-grained. [sent-60, score-0.429]

17 You had to decide up front how many pieces to divide the problem into. [sent-65, score-0.314]

18 The fork-join pool is optimized for fine-grained operations whereas the thread pool is optimized for coarse-grained operations. [sent-77, score-0.93]
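A minimal fork-join version of the same array sum (a sketch assuming the java.util.concurrent fork-join API that Java SE 7 added; the threshold value is an illustrative guess, not a number from the talk). Tasks recursively divide the range until subproblems are small enough to compute directly, and the pool's work-stealing scheduler spreads the fine-grained pieces across cores:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSum extends RecursiveTask<Long> {
    static final int THRESHOLD = 10_000; // illustrative cutoff; real code would tune this

    final long[] data;
    final int from, to;

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {       // conquer: small enough, solve directly
            long s = 0;
            for (int i = from; i < to; i++) s += data[i];
            return s;
        }
        int mid = (from + to) >>> 1;        // divide: split the range in half
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();                        // queue the left half for stealing
        long rightResult = right.compute(); // keep working on the right half
        return left.join() + rightResult;   // combine the sub-results
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        ForkJoinPool pool = new ForkJoinPool(); // one worker per core by default
        System.out.println(pool.invoke(new ForkJoinSum(data, 0, data.length)));
    }
}
```

Because the recursion keeps splitting down to a small threshold, the decomposition adapts to however many cores the pool has, instead of being fixed when the code is written.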

19 This approach scales nearly linearly with the number of hardware threads. [sent-81, score-0.277]

20 The goal for fork-join: Avoid context switches; Have as many threads as hardware threads and keep them all busy; Minimize queue lock contention for data structures. [sent-82, score-0.438]
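A short sketch of those goals in code (the sizing policy shown is an assumption drawn from the sentence above, not a prescription from the talk; `availableProcessors()` and the `ForkJoinPool` parallelism constructor are standard JDK calls):

```java
import java.util.concurrent.ForkJoinPool;

public class PoolSizing {
    public static void main(String[] args) {
        int hardwareThreads = Runtime.getRuntime().availableProcessors();

        // One worker per hardware thread: enough threads to keep every core
        // busy, few enough to avoid needless context switches.
        ForkJoinPool pool = new ForkJoinPool(hardwareThreads);
        System.out.println("parallelism = " + pool.getParallelism());

        // Each worker keeps its own work deque; idle workers steal from the
        // opposite end of a busy worker's deque, which minimizes contention
        // on any single shared queue.
    }
}
```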


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('cores', 0.313), ('divide', 0.235), ('faster', 0.168), ('pool', 0.162), ('grain', 0.155), ('parallelism', 0.153), ('chip', 0.137), ('thread', 0.132), ('thinner', 0.127), ('grained', 0.122), ('program', 0.114), ('multicore', 0.113), ('cpus', 0.112), ('parallel', 0.107), ('lunch', 0.106), ('number', 0.105), ('merge', 0.101), ('conquer', 0.101), ('threads', 0.098), ('fork', 0.098), ('coherent', 0.096), ('searching', 0.095), ('sub', 0.093), ('contention', 0.092), ('rick', 0.091), ('approach', 0.09), ('sorting', 0.087), ('ghz', 0.087), ('course', 0.085), ('processor', 0.085), ('processors', 0.084), ('brian', 0.083), ('hardware', 0.082), ('java', 0.08), ('problem', 0.079), ('task', 0.077), ('compelling', 0.076), ('moore', 0.075), ('must', 0.074), ('clock', 0.074), ('whereas', 0.074), ('programmers', 0.074), ('exploit', 0.073), ('computations', 0.072), ('coordination', 0.071), ('operations', 0.071), ('increasing', 0.07), ('tail', 0.068), ('specify', 0.068), ('queue', 0.068)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability


2 0.207974 317 high scalability-2008-05-10-Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance

Introduction: High Performance Multithreaded Access to Amazon SimpleDB is a great follow up to the idea in How SimpleDB Differs from a RDBMS that more programming is the price paid for performance in SimpleDB. It shows how much work and infrastructure is required to batter better performance out of SimpleDB. Remember, in SimpleDB you get keys to records from queries so if you want to get all the fields for records you need to make separate requests. Since SimpleDB isn't exactly a speed daemon the obvious strategy is to parallelize. Even if a job takes 100 msecs you can get a lot done in a little time if you can execute enough jobs in parallel. Parallelization is the approach taken by Haakon@AWS in his Java code example of how to get the most out of SimpleDB. You can find the code at Indexing and Querying Amazon S3 Metadata with Amazon SimpleDB . We'll also consider how a back-end service architecture built on Erlang may be a better fit with cloud computing. Two general mechanisms

3 0.2035179 612 high scalability-2009-05-31-Parallel Programming for real-world

Introduction: Multicore computers shift the burden of software performance from chip designers and architects to software developers. What is parallel computing? What is the difference between multi-threading, concurrency, and parallelism? What are the differences between task and data parallelism? And how can we use them? A fundamental article on parallel programming...

4 0.1983602 1276 high scalability-2012-07-04-Top Features of a Scalable Database

Introduction: This is a guest post by Douglas Wilson, EMEA Field Application Engineer at Raima, based on insights from building their Raima Database Manager. Scalability and Hardware Scalability is the ability to maintain performance as demands on the system increase, by adding further resources. Normally those resources will be in the form of hardware. Since processor speeds are no longer increasing much, scaling up the hardware normally means adding extra processors or cores, and more memory. Scalability and Software However, scalability requires software that can utilize the extra hardware effectively. The software must be designed to allow parallel processing. In the context of a database engine this means that the server component must be multi-threaded, to allow the operating system to schedule parallel tasks on all the cores that are available. Not only that, but the database engine must provide an efficient way to break its workload into as many parallel tasks as there

5 0.18161356 534 high scalability-2009-03-12-Google TechTalk: Amdahl's Law in the Multicore Era

Introduction: Over the last several decades computer architects have been phenomenally successful turning the transistor bounty provided by Moore's Law into chips with ever increasing single-threaded performance. During many of these successful years, however, many researchers paid scant attention to multiprocessor work. Now as vendors turn to multicore chips, researchers are reacting with more papers on multi-threaded systems. While this is good, we are concerned that further work on single-thread performance will be squashed. To help understand future high-level trade-offs, we develop a corollary to Amdahl's Law for multicore chips [Hill & Marty, IEEE Computer 2008]. It models fixed chip resources for alternative designs that use symmetric cores, asymmetric cores, or dynamic techniques that allow cores to work together on sequential execution. Our results encourage multicore designers to view performance of the entire chip rather than focus on core efficiencies. Moreover, we observe that obtai

6 0.17761095 1204 high scalability-2012-03-06-Ask For Forgiveness Programming - Or How We'll Program 1000 Cores

7 0.16749999 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution

8 0.15657867 1425 high scalability-2013-03-18-Beyond Threads and Callbacks - Application Architecture Pros and Cons

9 0.15603702 953 high scalability-2010-12-03-GPU vs CPU Smackdown : The Rise of Throughput-Oriented Architectures

10 0.15467322 575 high scalability-2009-04-21-Thread Pool Engine in MS CLR 4, and Work-Stealing scheduling algorithm

11 0.15443255 768 high scalability-2010-02-01-What Will Kill the Cloud?

12 0.15303691 459 high scalability-2008-12-03-Java World Interview on Scalability and Other Java Scalability Secrets

13 0.15263426 1319 high scalability-2012-09-10-Russ’ 10 Ingredient Recipe for Making 1 Million TPS on $5K Hardware

14 0.14846334 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

15 0.14829338 1355 high scalability-2012-11-05-Gone Fishin': Building Super Scalable Systems: Blade Runner Meets Autonomic Computing In The Ambient Cloud

16 0.14738236 778 high scalability-2010-02-15-The Amazing Collective Compute Power of the Ambient Cloud

17 0.13923918 608 high scalability-2009-05-27-The Future of the Parallelism and its Challenges

18 0.13878159 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it

19 0.13838528 1536 high scalability-2013-10-23-Strategy: Use Linux Taskset to Pin Processes or Let the OS Schedule It?

20 0.1350528 1429 high scalability-2013-03-25-AppBackplane - A Framework for Supporting Multiple Application Architectures


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.223), (1, 0.124), (2, -0.006), (3, 0.055), (4, -0.036), (5, 0.047), (6, 0.058), (7, 0.126), (8, -0.163), (9, -0.014), (10, 0.005), (11, -0.012), (12, 0.068), (13, 0.026), (14, -0.004), (15, -0.058), (16, -0.006), (17, 0.009), (18, -0.026), (19, 0.083), (20, 0.001), (21, -0.083), (22, -0.069), (23, -0.014), (24, 0.028), (25, -0.045), (26, 0.006), (27, 0.009), (28, 0.159), (29, 0.041), (30, 0.036), (31, 0.089), (32, -0.026), (33, -0.029), (34, 0.029), (35, -0.081), (36, 0.058), (37, 0.023), (38, 0.037), (39, 0.033), (40, -0.017), (41, 0.057), (42, -0.09), (43, -0.048), (44, 0.011), (45, -0.006), (46, 0.049), (47, -0.046), (48, 0.032), (49, 0.033)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96407139 636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability


2 0.78016955 1218 high scalability-2012-03-29-Strategy: Exploit Processor Affinity for High and Predictable Performance

Introduction: Martin Thompson wrote a really interesting article on the beneficial performance impact of taking advantage of Processor Affinity: The interesting thing I've observed is that the unpinned test will follow a step function of unpredictable performance. Across many runs I've seen different patterns but all similar in this step function nature. For the pinned tests I get consistent throughput with no step pattern and always the greatest throughput. The idea is that by assigning a thread to a particular CPU, when the thread is rescheduled to run on the same CPU it can take advantage of the "accumulated state in the processor, including instructions and data in the cache." With multi-core chips the norm now, you may want to decide for yourself how to assign work to cores and not let the OS do it for you. The results are surprisingly strong.

3 0.7729454 1536 high scalability-2013-10-23-Strategy: Use Linux Taskset to Pin Processes or Let the OS Schedule It?

Introduction: This question comes from Ulysses on an interesting thread from the Mechanical Sympathy news group, especially given how multiple processors are now the norm: Ulysses: On an 8xCPU Linux instance, is it at all advantageous to use the Linux taskset command to pin an 8xJVM process set (co-ordinated as a www.infinispan.org distributed cache/data grid) to a specific CPU affinity set (i.e. pin JVM0 process to CPU 0, JVM1 process to CPU1, ...., JVM7 process to CPU 7) vs. just letting the Linux OS use its default mechanism for provisioning the 8xJVM process set to the available CPUs? In an effort to seek an optimal point (in the full event space), what are the conceptual trade-offs in considering "searching" each permutation of provisioning an 8xJVM process set to an 8xCPU set via taskset? Given taskset is the key to the question, it would help to have a definition: Used to set or retrieve the CPU affinity of a running process given its PID or to launch a new COMMAND with

4 0.77034289 317 high scalability-2008-05-10-Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance

Introduction: High Performance Multithreaded Access to Amazon SimpleDB is a great follow up to the idea in How SimpleDB Differs from a RDBMS that more programming is the price paid for performance in SimpleDB. It shows how much work and infrastructure is required to batter better performance out of SimpleDB. Remember, in SimpleDB you get keys to records from queries so if you want to get all the fields for records you need to make separate requests. Since SimpleDB isn't exactly a speed daemon the obvious strategy is to parallelize. Even if a job takes 100 msecs you can get a lot done in a little time if you can execute enough jobs in parallel. Parallelization is the approach taken by Haakon@AWS in his Java code example of how to get the most out of SimpleDB. You can find the code at Indexing and Querying Amazon S3 Metadata with Amazon SimpleDB . We'll also consider how a back-end service architecture built on Erlang may be a better fit with cloud computing. Two general mechanisms

5 0.76978821 953 high scalability-2010-12-03-GPU vs CPU Smackdown : The Rise of Throughput-Oriented Architectures

Introduction: In some ways the original Amazon cloud, the one most of us still live in, was like that really cool house that when you stepped inside and saw the old green shag carpet in the living room, you knew the house hadn't been updated in a while. The network is a little slow, the processors are a bit dated, and virtualization made the house just feel smaller. It has been difficult to run high bandwidth or low latency workloads in the cloud. Bottlenecks everywhere. Not a big deal for most applications, but for many high performance applications (HPC) it was a killer. In a typical house you might just do a remodel. Upgrade a few rooms. Swap out builder quality appliances with gleaming stainless steel monsters. But Amazon has a big lot, instead of remodeling they simply keep adding on entire new wings, kind of like the  Winchester Mystery House of computing. The first new wing added was a CPU based HPC system  featuring blazingly fast Nehalem chips , virtualization replaced by a close t

6 0.75741279 1319 high scalability-2012-09-10-Russ’ 10 Ingredient Recipe for Making 1 Million TPS on $5K Hardware

7 0.7501868 575 high scalability-2009-04-21-Thread Pool Engine in MS CLR 4, and Work-Stealing scheduling algorithm

8 0.75011641 1454 high scalability-2013-05-08-Typesafe Interview: Scala + Akka is an IaaS for Your Process Architecture

9 0.74936086 735 high scalability-2009-11-01-Squeeze more performance from Parallelism

10 0.74016589 581 high scalability-2009-04-26-Map-Reduce for Machine Learning on Multicore

11 0.73526782 1204 high scalability-2012-03-06-Ask For Forgiveness Programming - Or How We'll Program 1000 Cores

12 0.73098624 534 high scalability-2009-03-12-Google TechTalk: Amdahl's Law in the Multicore Era

13 0.72170788 505 high scalability-2009-02-01-More Chips Means Less Salsa

14 0.71976286 1237 high scalability-2012-05-02-12 Ways to Increase Throughput by 32X and Reduce Latency by 20X

15 0.71651268 826 high scalability-2010-05-12-The Rise of the Virtual Cellular Machines

16 0.71032172 1425 high scalability-2013-03-18-Beyond Threads and Callbacks - Application Architecture Pros and Cons

17 0.69744909 1641 high scalability-2014-05-01-Paper: Can Programming Be Liberated From The Von Neumann Style?

18 0.69571418 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution

19 0.68144077 1591 high scalability-2014-02-05-Little’s Law, Scalability and Fault Tolerance: The OS is your bottleneck. What you can do?

20 0.67427838 1541 high scalability-2013-10-31-Paper: Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.157), (2, 0.245), (5, 0.014), (10, 0.061), (30, 0.018), (40, 0.029), (44, 0.103), (61, 0.043), (77, 0.039), (79, 0.117), (85, 0.051), (94, 0.037)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95780373 1177 high scalability-2012-01-19-Is it time to get rid of the Linux OS model in the cloud?

Introduction: You program in a dynamic language, that runs on a JVM, that runs on an OS designed 40 years ago for a completely different purpose, that runs on virtualized hardware. Does this make sense? We've talked about this idea before in Machine VM + Cloud API - Rewriting The Cloud From Scratch , where the vision is to treat cloud virtual hardware as a compiler target, and converting high-level language source code directly into kernels that run on it. As new technologies evolve the friction created by our old tool chains and architecture models becomes ever more obvious. Take, for example, what a team at UCSD is releasing: a phase-change memory prototype - a solid state storage device that provides performance thousands of times faster than a conventional hard drive and up to seven times faster than current state-of-the-art solid-state drives (SSDs). However, PCM has access latencies several times slower than DRAM. This technology has obvious mind blowing implications, but an

same-blog 2 0.95533043 636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability


3 0.95030963 366 high scalability-2008-08-17-Many updates against MySQL

Introduction: Hello! My first post here, so be patient please. I am developing a site where I have lots of static content. But on many pages I have a query to update the count of views. I would say this may cause lots of problems, and I was interested in another solution, like storing these counts somewhere else. As my knowledge is a bit limited in this way, I am asking you. I can say I understand PHP (OOP of course) and MySQL. Nowadays I am getting into servers. The other question I have is: I read about making lots of things static (in Flickr Architecture) and am interested in how they do static sites? Let's say they make the photo page static? And rebuild when a tag or comment is added? I am a bit interested in it as I want to learn Smarty better (newbie) and serving content. Moreover, how about PHP? I have read many books about PHP theoretically but would love to see some real-life examples of using objects and exceptions (mainly this, as I don't completely understand it) to learn some good programming habits. So if you can help

4 0.94809467 927 high scalability-2010-10-26-Marrying memcached and NoSQL

Introduction: Memcached is one of the most common in-memory cache implementations. It was originally developed by Danga Interactive for LiveJournal , but is now used by many other sites as a side cache to speed up read-mostly operations. It gained popularity in the non-Java world, too, especially since it's a language-neutral side cache for which few alternatives existed. As a side cache, memcached clients rely on the database as the system of record; the database is still used for write, update, and complex query operations. Since the memcached specification includes no query operations, memcached is not a database alternative, unlike most of the NoSQL offerings. This also excludes memcached from being a real solution for write scalability. As a result many of the heavy sites started to move away from memcached and replace it with other NoSQL alternatives, as noted in a recent High Scalability post MySQL And Memcached: End Of An Era? The transition away from memcached to NoSQL

5 0.94750053 660 high scalability-2009-07-21-Paper: Parallelizing the Web Browser

Introduction: There have been reports that software engineering is dead . Maybe, like the future, software engineering is simply not evenly distributed? When you read this paper I think you'll agree there is some real engineering going on, it's just that most of the things we need to build do not require real engineering. Much like my old childhood tree fort could be patched together and was "good enough." This brings to mind the old joke: If a software tree falls in the woods would anyone hear it fall? Only if it tweeted on the way down... What this paper really showed me is we need not only to change programming practices and constructs, but we also need to design solutions that allow for deep parallelism to begin with. Grafting parallelism on later is difficult. Parallel execution requires knowing precisely how components are dependent on each other and that level of precision tends to go far beyond the human attention span. In particular this paper deals with how to parallelize the browser on

6 0.94543153 1602 high scalability-2014-02-26-The WhatsApp Architecture Facebook Bought For $19 Billion

7 0.94291472 817 high scalability-2010-04-29-Product: SciDB - A Science-Oriented DBMS at 100 Petabytes

8 0.94206011 1537 high scalability-2013-10-25-Stuff The Internet Says On Scalability For October 25th, 2013

9 0.94203347 674 high scalability-2009-08-07-The Canonical Cloud Architecture

10 0.94189221 514 high scalability-2009-02-18-Numbers Everyone Should Know

11 0.94010544 274 high scalability-2008-03-12-YouTube Architecture

12 0.93969971 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge

13 0.93954277 671 high scalability-2009-08-05-Stack Overflow Architecture

14 0.9394837 847 high scalability-2010-06-23-Product: dbShards - Share Nothing. Shard Everything.

15 0.93936431 1643 high scalability-2014-05-06-The Quest for Database Scale: the 1 M TPS challenge - Three Design Points and Five common Bottlenecks to avoid

16 0.93929207 1447 high scalability-2013-04-26-Stuff The Internet Says On Scalability For April 26, 2013

17 0.93870759 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012

18 0.93858731 857 high scalability-2010-07-13-DbShards Part Deux - The Internals

19 0.93844312 1171 high scalability-2012-01-09-The Etsy Saga: From Silos to Happy to Billions of Pageviews a Month

20 0.93836236 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations