This article originally
appeared on www.embedded.com
in January 2006 in the Embedded Soapbox Section
Multiprocessors Need Distributed Objects
After decades of anticipation,
multiprocessors are becoming a fact of life for most software engineers,
especially in embedded systems. Symmetric multiprocessors, which
use a number of identical processors, one operating system and
shared memory, are relatively easy to program, because we can use
additional processors to execute additional threads. (But watch
out for multi-threaded software that assumes it's running on a uniprocessor,
where a low priority thread never runs while a high priority thread
is running.)
Asymmetric multiprocessors,
which lack shared memory, have dissimilar processors or run different
operating systems on different processors, are more difficult to
program, but increasingly common. Threads of execution on different
processors are isolated from each other and cannot directly access
the same data. The software must ensure that threads have the data
they need when they need it. Results may be affected by the order
in which events occur on different processors.
Asymmetric multiprocessors
are effectively distributed systems, and some of the tools and techniques
of distributed systems can help us address these problems. One example
is a computational model involving threads communicating by exchanging
messages using inter-process communication (IPC). This model is
attractive because it reflects the underlying reality, but it has
limitations. It introduces a new basis of software partitioning
- the message - in addition to the familiar concepts of function
calls and member function calls. It forces the developer to deal
with remote processing and local processing in different terms and
to make up-front decisions about what can and cannot be accomplished
remotely. It also forces early decisions on where different processing
is done.
Consider what happens
if we start a high level design by partitioning an application into
threads exchanging messages. When a message is passed to a thread
on the local processor, a message passing overhead (albeit reduced)
is incurred, so message passing must be used sparingly and only
where remote processing is a possibility. The messaging schema determines
the data available to each thread and vice versa. This approach
commits us to the location of data and of processing before we have
a good understanding of our new design's behavior. If we get it
wrong, we have to start over. If the hardware environment changes,
we have to start over.
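The threads and messages style described above can be sketched in a few lines. This is only an illustration, not code from any particular system: the message names (ADD_READING, GET_AVERAGE, STOP) and the functions sensor_worker and run_demo are hypothetical, and Python's standard queue module stands in for a real IPC mechanism.

```python
import queue
import threading


def sensor_worker(inbox, outbox):
    """Worker thread: owns its data and is reachable only through messages."""
    readings = []  # other threads cannot touch this list directly
    while True:
        kind, payload = inbox.get()
        # Hypothetical message schema: each operation needs its own message
        # type, and the set of types is fixed when the design is partitioned.
        if kind == "ADD_READING":
            readings.append(payload)
        elif kind == "GET_AVERAGE":
            outbox.put(sum(readings) / len(readings))
        elif kind == "STOP":
            break


def run_demo():
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=sensor_worker, args=(inbox, outbox))
    worker.start()
    for value in (10.0, 20.0, 30.0):
        inbox.put(("ADD_READING", value))
    inbox.put(("GET_AVERAGE", None))
    average = outbox.get(timeout=5)  # wait for the reply message
    inbox.put(("STOP", None))
    worker.join()
    return average
```

Even with both threads on one processor, every interaction pays for a queue operation and a dispatch, and moving the readings to a different thread means redesigning the message schema.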
A "distributed
object model" is an alternative to the threads and messages
approach. It brings all the benefits of object oriented programming
and adds some benefits of its own. It divides application data into
discrete objects, allowing us to assign processing to different
processors by locating objects on them. A logical thread of execution
visits whatever processor it needs to, in order to complete its
work. The message passing layer of software is hidden inside remote
method calls, which are semantically and syntactically identical
to local method calls. The message passing layer can be automatically
generated from method signatures, and any method in any class can
potentially be called remotely with a corresponding message. Local
method calls incur no message passing overhead.
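A minimal sketch of this idea, under stated assumptions: the classes Node, RemoteProxy and Counter and the function bump_twice are hypothetical names invented for illustration, and an in-process method call stands in for the real transport between processors. Python's dynamic attribute lookup plays the role that a code generator working from method signatures would play in a compiled language.

```python
class Node:
    """Stand-in for one processor: holds objects and handles request messages."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.messages_handled = 0

    def handle(self, message):
        # A real system would carry this message between processors.
        object_id, method_name, args = message
        self.messages_handled += 1
        return getattr(self.objects[object_id], method_name)(*args)


class RemoteProxy:
    """Turns any method call into a message; nothing is written per method."""

    def __init__(self, node, object_id):
        self._node = node
        self._object_id = object_id

    def __getattr__(self, method_name):
        # Called only for names not found on the proxy itself.
        def call(*args):
            return self._node.handle((self._object_id, method_name, args))
        return call


class Counter:
    def __init__(self):
        self.value = 0

    def add(self, amount):
        self.value += amount
        return self.value


def bump_twice(counter):
    # Caller code is the same whether `counter` is local or a proxy.
    counter.add(1)
    return counter.add(1)
```

Passing a plain Counter to bump_twice involves no messages at all, while passing a RemoteProxy for a Counter held by a Node produces one message per call, with no change to the calling code.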
Consider what happens
if we model an application in terms of distributed objects instead
of threads and messages. A detailed system model can be built in
terms of objects and methods without undue attention to object location.
(I say undue attention because object location must be borne in
mind in key scenarios, but need not be completely specified at the
outset.) A method call is a method call and we don't worry too much
about whether it's local or remote. The message passing layer is
a consequence of the classes and methods that are remotely accessed,
which in turn are consequences of object location. Object location
can be decided late and changed easily. When it is decided or changed,
the message passing layer can be automatically generated, not
hand-written. So a distributed object design methodology can reduce the
cost and increase the flexibility of a distributed application.
Load balancing and ports to different hardware architectures can
both be addressed by reassigning objects to different processors.
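Late binding of object location can be illustrated with a placement table that is consulted only when the system is built. The sketch below is an assumption-laden toy, not a description of any real middleware: Node, Proxy, Filter, build and application are hypothetical names, the node names cpu0 and cpu1 are made up, and an in-process call stands in for the transport.

```python
class Node:
    """Stand-in for one processor; counts the messages it receives."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.messages_handled = 0

    def invoke(self, object_id, method_name, args):
        self.messages_handled += 1
        return getattr(self.objects[object_id], method_name)(*args)


class Proxy:
    """Forwards any method call to the node that holds the object."""

    def __init__(self, node, object_id):
        self._node, self._object_id = node, object_id

    def __getattr__(self, method_name):
        return lambda *args: self._node.invoke(self._object_id, method_name, args)


class Filter:
    def scale(self, sample):
        return sample * 2


def build(placement, nodes, local_name):
    """Create each object on the node named in `placement`.

    Returns references usable from the node called `local_name`:
    the object itself if it is local, a proxy otherwise.
    """
    refs = {}
    for object_id, (cls, node_name) in placement.items():
        obj = cls()
        if node_name == local_name:
            refs[object_id] = obj
        else:
            nodes[node_name].objects[object_id] = obj
            refs[object_id] = Proxy(nodes[node_name], object_id)
    return refs


def application(refs):
    # Application logic never mentions where the filter lives.
    return [refs["filter"].scale(s) for s in (1, 2, 3)]
```

Changing the placement entry for "filter" from "cpu0" to "cpu1" moves the work to the other node without touching application, which is the kind of reassignment that load balancing or a port to new hardware would call for.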
Distributed object
middleware has been very successful in traditional distributed processing.
CORBA, COM and Java RMI are well-known implementations of the paradigm.
But these technologies are designed for big computers with big operating
systems on slow networks. They focus largely on defining and implementing
interfaces between subsystems, with each subsystem confined to an
individual processor. Used carelessly, they give disappointing results.
The multicore environment
demands distributed object middleware as small and as fast as hand-optimized
message passing. It must be highly transparent to allow most objects
to be remotely accessed, not just a few. Remote invocation needs
to be like walking into another room, not visiting another house.
If this can be achieved, distributed object technology can greatly
reduce the difficulty, risk and cost of software development for
asymmetric multiprocessors.
For a discussion of
the issues of embedded distributed processing, see [???]
For a description of
redFOX, our answer to the problems of embedded distributed processing,
see the
Fast Facts page.