Section 3: Module 4: Part 1: The Proxy Pattern

What we're going to talk about now is something that makes a lot more sense once we've gotten to the point where we're discussing AIDL and Binder RPC, because it figures prominently in that context. We're going to start talking about the Proxy pattern. As we'll see, the Proxy pattern appeared in a couple of different places: it appeared in the Gang of Four book, which focuses primarily on a fairly simplistic way of doing proxies, and later it appeared in the POSA1 book, where it's described a bit more generally, focusing on communication in environments where you're going across address spaces. We're going to merge these two treatments together: we'll talk first about the pattern, and then about how it gets applied in the context of Android.

So what's the context? In most environments where you have to cross address spaces, it's not really feasible to access the internal details of an object directly. If it's in a server process, you just can't get to it from the client; you're separated by an abstraction boundary. That could be a logical boundary, like you might have with, say, Java, or it might be a "physical" boundary protected by the hardware, where things run in separate processes. Whatever the case, you can't get to the object directly, and of course Android provides the Binder RPC mechanism, which we can use to communicate across the backplane of the device between processes that are on the same device but in different address spaces. Another part of the context is that as you develop your system over time, it's often the case that you change your mind about where you want your objects to live. This is really easy to motivate with distributed systems: once networks are involved, you might start out with things co-located in the same address space or on the same machine for various reasons — simplicity of programming being the most obvious one — and then, as your customers scale up or scale out, you want to add more and more processing to your solution. Rather than putting everything in one address space or on one machine, you now have multiple machines, and you may keep changing where things reside as your system grows, or as you profile it to learn where the hot spots are. The point is that things are going to change over time.

So what are the problems? One problem is that manually writing code to marshal and demarshal communication between address spaces is tedious, error-prone, and non-portable. It's a real pain: nobody wants to write that code, it's hard to maintain, and it's hard to optimize, so you'd really like not to have to do it yourself. The second problem is that it's a lot of work to rewrite code every time you reconfigure and redeploy your software along different partitioning boundaries. If you program with lower-level IPC mechanisms like sockets, the amount of work required to change from communicating between things in the same address space, to things on the same machine in different processes, to things on different machines, is substantial when done by hand — so most people give up very quickly and say, "the heck with it, it's good enough, I'm not going to worry about it." These are the kinds of impediments we run into when we start building systems that have to be broken up across address boundaries.

So what's the solution? As you'll see, there's a multi-part solution, but one piece of it is to define something called a proxy. The proxy acts as a surrogate or ambassador that presents a particular interface — actually the same interface — to the client, while allowing the implementation of that interface to live someplace else. That someplace else could be a different address space on the same machine, it might be the operating-system kernel accessed by an application running at user level, or it might be another process on another machine. We don't really know, and we don't really care: the proxy abstracts and encapsulates the details of where things actually reside, which gives us a lot more flexibility to move things around.

Here's how you might do this on Android. You inherit a remote object from a Binder, as we've discussed many times, and this Binder allows you to export that object to the receiver side. The way you get a proxy to this thing is to describe its interface in an AIDL file, which specifies the methods and their signatures; you run that through the aidl compiler, and the proxy is generated for you automatically.
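The AIDL workflow just described can be sketched in plain Java. This is only a mimic of the *shape* of what the aidl tool generates — the interface name (IDownloadService) and method are hypothetical, and real generated code extends android.os.Binder and marshals arguments through Parcels:

```java
// Hypothetical plain-Java sketch of the pieces aidl generates for an interface
// like "interface IDownloadService { String downloadImage(String url); }".
// Real generated code marshals through Parcels and calls transact(); this
// mimics only the structure: one shared interface, a service-side Stub, and a
// client-side Proxy the client cannot distinguish from the real object.

interface IDownloadService {
    String downloadImage(String url);
}

// Service side: the real implementation ("real subject") subclasses the Stub.
abstract class DownloadServiceStub implements IDownloadService {
    // Client side: hand back a proxy that implements the same interface.
    static IDownloadService asInterface(IDownloadService remote) {
        return new Proxy(remote);
    }

    private static final class Proxy implements IDownloadService {
        private final IDownloadService remote; // stands in for the binder handle
        Proxy(IDownloadService remote) { this.remote = remote; }
        public String downloadImage(String url) {
            // Real generated code would marshal url and ship it across the
            // process boundary; here we simply forward the call.
            return remote.downloadImage(url);
        }
    }
}

public class ProxyDemo {
    public static void main(String[] args) {
        IDownloadService impl = new DownloadServiceStub() {
            public String downloadImage(String url) { return "bytes-of:" + url; }
        };
        IDownloadService service = DownloadServiceStub.asInterface(impl);
        // To the client this looks like an ordinary local call.
        System.out.println(service.downloadImage("http://example.com/cat.png"));
    }
}
```

The key design point is that the client only ever names the interface type, never the stub or the proxy class, which is what lets the implementation move without client changes.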

When a client invokes a method call on the proxy, it thinks it's accessing something local: it says service.downloadImage() or downloadFile() or whatever, and it looks like a local call when in fact it's calling something elsewhere. Where that something resides you can control in Android by changing various directives in the manifest file, so you get some control over whether the thing lives in the same address space or a different one. But the point is that, same or different, the proxy is shielding you from what's actually taking place. As we'll see later, the Proxy pattern works together with the Android Binder RPC mechanism to implement the Broker pattern. We're not going to talk about that quite yet — that comes next — but be aware that that's where we're headed. In broader pattern terms, we'd say that Broker is a pattern language and Proxy is one piece of that pattern language; we'll probably talk about that later.

As always, these are not just random assortments of classes thrown together for ad hoc reasons; they fit into a grander scheme, and that scheme has been around for quite a long time. The early work in this area goes back to the 1980s, so a lot of this stuff has been around for 30-plus years at this point. The pattern is called the Proxy pattern, and its intent is to provide a surrogate or placeholder for an object, so that another object can access it and control access to it. Take a look here for more information about the Proxy pattern.

There are a couple of different reasons to use this pattern. The typical one is that you need to access an object via something more powerful than a simple pointer or reference. The pointers and references you find in C++ and Java are great when you're talking to objects in the same address space; when objects reside in other address spaces they don't work so well, because they can't reach across those address spaces — that's when we need proxies. Another motivation is to make it easy to change where objects reside — local versus remote, or co-located versus remote — without changing the way you access them. Another reason, which I'll add just for fun, is when there's a benefit from having strongly, statically typed method invocations; what I'm really contrasting here are approaches that use proxies versus approaches based on message passing, which we'll talk about later.

Here's a quick synopsis of the structure and participants in this pattern; it's pretty straightforward. This is the diagram from the Gang of Four book. We have a Subject, which is the thing that defines the interface — in Android, that's obviously the thing defined with AIDL. We have the Proxy, which is what's shown to the client; it does the magic under the hood (we'll talk about that in a second) to turn the method call into something that can travel over to the receiver side, to the real subject — in Android that's handled by the generated code, the Proxy class nested inside the Stub class. Finally we have the "real subject," which in the context of Android is some combination of the Stub and the actual implementation.

When you look at this through a distributed-systems lens, you see that the Gang of Four description is a little bit meager: there are other steps involved in getting the call to the other side and turning it back into a method call. But for our purposes right now, just think of the proxy as the local surrogate — the local placeholder or ambassador — for some object that resides someplace else: when you invoke an operation on the proxy, it does some work and then forwards the call on to the real subject to do the actual work.

This set of dynamics diagrams illustrates the point in a bit more detail. The client starts out by getting a proxy from someplace, typically from a factory of some kind, or a factory method. Once it has the proxy, it invokes a method call on it. As you can see, most of the time you're using proxies to go across address-space boundaries — not always, but mostly.
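The structure and participants just described can be shown as a minimal sketch. The class names follow the Gang of Four diagram; the method and its behavior are invented for illustration:

```java
// Minimal sketch of the Gang of Four participants discussed above.
// Subject = the interface (in Android, what the AIDL file defines),
// Proxy = the surrogate shown to the client (in Android, Stub.Proxy),
// RealSubject = the actual implementation behind the boundary.

interface Subject {
    int request(int arg);
}

class RealSubject implements Subject {
    public int request(int arg) { return arg * 2; } // the real work
}

class Proxy implements Subject {
    private final Subject realSubject;
    Proxy(Subject realSubject) { this.realSubject = realSubject; }
    public int request(int arg) {
        // A distributed proxy would marshal arg and ship it across the
        // boundary here; the client only ever sees the Subject interface.
        return realSubject.request(arg);
    }
}

public class GofProxyDemo {
    public static void main(String[] args) {
        Subject subject = new Proxy(new RealSubject());
        System.out.println(subject.request(21)); // prints 42
    }
}
```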

What's happening in this case is that the proxy converts the method call that was invoked on it into some kind of message: it takes the objects coming in as parameters and serializes them into linear byte sequences with a particular encoding format. Then — and this is the part the Proxy pattern doesn't talk about — it somehow gets that message over to the real subject to do the work. When the real subject is done, the call returns, typically synchronously (although there are variants that do asynchronous interactions). The results come back in message format, get turned back into whatever native format is expected by the caller, and are returned to the caller — so, whether through the return value, out parameters, or in-out parameters, the results are updated and the client gets them back from the server. It's basically method to message and back to method again. Any questions about the steps, either static or dynamic?

There are a number of interesting consequences of applying this pattern. One of the key ones — and the reason it gets used so much in distributed systems — is that it allows you, at least in theory, to decouple the location of an object from access to that object. You can change where the object resides without changing other things: you don't have to change the interface, and there isn't a lot of extra work on your behalf as a client or server application developer. You can also use it to simplify tedious and error-prone details. Here's a little snippet of the code you'd see if you peeked inside that downloadImage proxy that's generated for you; we looked at it before, and it's messy and ugly — nobody in their right mind wants to write and maintain that code by hand. We'd much rather have it done for us by the proxy generator.

There are also some potential liabilities or downsides, however. One is that you're at the mercy of the writer of the proxy generator. If the generator does a good job, you're in pretty good shape: the result is probably equivalent to what you could do by hand. If it does a bad job, you've got lots of overhead and there's not much you can do about it, unless your middleware lets you handcraft your own proxies and plug them in in place of the automatically generated ones. This is one of those topics that's gone back and forth for decades: there was a time when interface generators were really bad, so people did a lot of work on optimizing things, either by hand or by writing better generators. As with everything else, people get smarter about this stuff over time, so a well-written generator should be competitive with code you'd write by hand. In fact, there was some interesting work done maybe 15 years ago by some folks at the University of Utah on Flick, an IDL compiler that was heavily optimized to generate very fast code; it used programming-language transformation and optimization techniques to analyze the code that needed to be generated and come up with optimized ways of generating the proxy code. Interesting stuff.

Another potential downside — and this comes back to the strong-typing motivation I added to the slides a moment ago — is that a proxy can overly restrict the type system available on the client side. This is a subtle point, and I want to have some discussion about it to see what other people's insights and impressions are. When you use proxies, you're saying ahead of time: these are the valid types that can be passed to methods with a particular signature, and here are the results that can come back — you're tightly coupling those things. Now, when you write normal Java or C++ code you do that all the time, so at one level it's not unusual. But when you move into a distributed environment, or any environment where client and server are separated so that multiple clients access a common shared server, things get more interesting. There was a fair amount of backlash a number of years ago against strongly and statically typed proxies in middleware for distributed computing systems. The alternative approach — which you see all the time in SOAP and other message-oriented, XML-based messaging systems — is more dynamically typed: you build up a message piece by piece, usually with some internal encoding that keeps a type tag (like a tag in XML) alongside the data, so the messages are self-describing.
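A self-describing message of the kind just described can be sketched in a few lines. The field names, type tags, and the single "add" operation are all invented for illustration; the point is that the type information travels with the data, and one generic entry point replaces many strongly typed methods:

```java
import java.util.*;

// Hedged sketch of a self-describing message: each field carries its own type
// tag alongside the data (like a tag in XML), so the receiver can interpret
// messages from clients it was never compiled against.

class Message {
    // field name -> { type tag, value }; the tag travels with the data.
    final Map<String, Object[]> fields = new LinkedHashMap<>();
    Message put(String name, String typeTag, Object value) {
        fields.put(name, new Object[] { typeTag, value });
        return this;
    }
    Object get(String name) { return fields.get(name)[1]; }
    String typeOf(String name) { return (String) fields.get(name)[0]; }
}

public class SelfDescribingDemo {
    // One generic entry point instead of an interface full of typed methods.
    static Message send(Message request) {
        Message reply = new Message();
        if ("add".equals(request.get("op"))) {
            int a = (Integer) request.get("a");
            int b = (Integer) request.get("b");
            reply.put("result", "int", a + b);
        } else {
            // Unknown operations can be answered without recompiling callers.
            reply.put("error", "string", "unknown op");
        }
        return reply;
    }

    public static void main(String[] args) {
        Message req = new Message()
                .put("op", "string", "add")
                .put("a", "int", 2)
                .put("b", "int", 3);
        Message rep = send(req);
        System.out.println(rep.typeOf("result") + ": " + rep.get("result")); // int: 5
    }
}
```

Notice the trade-off this makes concrete: nothing stops a caller from putting a string where an int belongs, and the mistake only surfaces at run time.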

Instead of an interface with lots of different methods and strongly typed signatures, you have one interface called send that takes a message, and the message can be structured in some arbitrary, self-describing way. In fact, if you look at Messengers in Android, that's the way they work: you create a Messenger, it's got a send method that takes a Message, and you put Bundles into the Message, so you can put anything you want in there.

So, does anybody want to hazard an opinion about the trade-offs, pros, and cons between the proxy-like approach, which is more statically typed, and the message-based approach, which is more dynamically typed? Why would you choose one versus the other? "I like the static typing because you get most of your errors in front of the compiler." Right — when you use a statically typed approach, you're letting the compiler and the rest of the language-processing tool chain determine early on when there are inconsistencies between the types you're creating on the client side and the things you're passing to the server side. So strong typing removes errors. Another thing that often comes along with that is optimization: the more type information is available, the more you can usually optimize. With strong typing you often don't have to send the type information over at all — the sender and receiver agree ahead of time, so only the data is sent with a message. The amount of data on the network goes down, the amount of data to be processed goes down, and so on. Those are the arguments in favor of the strongly typed approach.

But there's another point of view. I'll never forget, when I first started programming back in the mid-80s, reading about Fortran and C programmers who didn't like Pascal; they would always say, "strong typing is for weak minds." So what's the downside of the proxy-based approach with its stronger typing? It's less flexible. What that really means in this context is that as your services evolve over time, if you have strong typing, then all the clients that rely on a service have to evolve along with it to utilize the new type information. It becomes "inflexible" because now you've got to go out and somehow change all the clients — and there can be a lot of clients, many of which you have no control over because they belong to other organizations, other development teams, external groups you have no knowledge of whatsoever. In those environments — especially if you're doing large-scale enterprise system integration — it's often useful to keep the messages dynamically typed. As you evolve your services, your new clients can do new things and add new parts to the message; when they send to the service, the service says, "aha, I got a message from a new client, I know how to process that," while old messages from old clients are still handled without requiring changes to those clients. So there are real trade-offs between forcing changes to the edge of the system, which is what strong typing through proxies requires, versus having the typing decisions made by the service, which gives you more flexibility and dynamism but is typically slower, has more time and space overhead, and also leads to surprises — as Christophe was just saying, you often don't realize until the system is running that people are sending you gobbledygook that makes no sense at all. That's a key thing to remember, and probably a good quiz question: what are the trade-offs between strong typing and dynamic typing in the context of Proxy?

We may or may not see this later — there's some use of it in Android, though perhaps not enough to warrant calling it a full-blown pattern there — but there's a pattern called Extension Interface, widely used in Microsoft COM and in some parts of Enterprise JavaBeans. You have these things called components, where a component is basically an object on steroids that exposes multiple interfaces. Each interface exposed by the component is strongly typed and has proxies, and there's a little "twenty questions" negotiation between the client and the component: the client queries the component — do you support this interface? No? Do you support this interface? Ah, you do, thank you — and then it takes the interface it gets back and makes method calls on it. So Extension Interface is a hybridization of the static typing you get with Proxy and the flexibility you get with messaging; it gives you the best of both worlds, or the worst of both worlds, depending on how you look at it.

Another problem with Proxy is that it's very difficult to entirely shield the application client from the fact that there's a boundary between sender and receiver.
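The extension-interface "twenty questions" negotiation described above can be sketched as follows. The interface names and the Component class are illustrative — this is not COM's actual QueryInterface API, just the same idea in plain Java:

```java
import java.util.*;

// Sketch of the extension-interface negotiation: the client asks the component
// whether it supports an interface and, if so, gets back a strongly typed
// reference it can make ordinary method calls on. All names are invented.

interface Downloader { String download(String url); }
interface Logger { void log(String msg); }

class Component {
    private final Map<Class<?>, Object> interfaces = new HashMap<>();

    <T> void register(Class<T> type, T impl) { interfaces.put(type, impl); }

    // "Do you support this interface?" -- null means "no".
    <T> T queryInterface(Class<T> type) { return type.cast(interfaces.get(type)); }
}

public class ExtensionDemo {
    public static void main(String[] args) {
        Component c = new Component();
        c.register(Downloader.class, url -> "data-from:" + url);

        Downloader d = c.queryInterface(Downloader.class); // supported
        Logger l = c.queryInterface(Logger.class);         // not supported
        System.out.println(d.download("http://example.com"));
        System.out.println(l == null); // prints true
    }
}
```

Each interface the query returns is fully statically typed, while the negotiation step itself is dynamic — which is exactly the hybrid character described above.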

You might like to pretend the boundary is invisible, but invariably it's not, and when things go wrong there are errors that occur when things run in different address spaces that simply do not occur when they're collocated in the same address space. If you make a method call on an object in the same address space, "network unreachable" is unlikely to occur; whereas if you're going across the Internet, someone might cut the network cable with a backhoe, or a storm might blow down a wireless tower and you lose connectivity. These are things you have to be prepared for when you program clients using proxies that you wouldn't have to be prepared for if you were just making normal calls.

There are a number of known uses here. Most of them are actually much more interesting in the context of the Broker pattern, but they also use proxies, so I'll just mention them now and we'll cover them in more detail later. There's Sun RPC — the so-called Open Network Computing (ONC) remote procedure call model — which started back in the middle 80s as a successor to work that came out of Xerox PARC on remote procedure calls by Andrew Birrell and Bruce Nelson. There's also the Distributed Computing Environment (DCE), which came around in the late 80s and was kind of the end of the line for very convoluted, complicated C-based RPC mechanisms. What came next was the combination of the peanut butter of distribution with the chocolate of objects to form distributed object computing, and you got things like Sun's remote method invocation (RMI), Microsoft's DCOM, and the Common Object Request Broker Architecture (CORBA) with its various object request brokers. A lot of work was done there, and all of it uses proxies very heavily. And nowadays, 20 or 30 years down the road, this stuff resurfaces again — this time in the context of communication between processes using the Binder mechanisms for local inter-process communication on the same device in Android. It's also worth noticing that if you click on this link, you'll be taken to a little stub article describing where the Android Binder framework came from: it grew out of earlier work done for BeOS — and, I think, Palm OS — by a group of people led by Dianne Hackborn (who has the coolest name of any programmer in the world: born to hack, right? We actually have a guy at ISIS called Doug Hackworth, so he's a worthy hacker, I guess). They wrote that binder, called OpenBinder, they later joined the Android development team, and that's what the Android Binder is based on.

As I wrap up this section before we migrate to the next topic, I want to make some interesting observations about proxies. Proxies are a great microcosm of a trend that's been going on in computing for decades, ever since the very beginning: useful patterns evolve over time into features provided and supported automatically by various kinds of language-processing tools — either programming languages themselves or tools like AIDL compilers. Here are some examples. If you go back far enough, you'll see that early developers of assembly code used certain patterns for writing assembly, and those patterns later found their way into second-generation languages like Fortran and C as closed subroutines, if-else statements, switch statements, and loops of various kinds. Those all started out with assembly-language programmers saying, "gosh, if we organize our assembly code in these patterns, it'll be easier to maintain," and smart people like John Backus, who led the development of the first Fortran compiler, said, "you know, we can actually automate that — we can build a tool that does it for us." It's kind of funny: in the early days of compilation and second-generation languages, there was all kinds of controversy between the super-duper assembly-language programmers, who knew how to write really tight assembly code and do register allocation by hand, and the newfangled developers who were trying to write compilers to automate that stuff. People said, "you'll never be able to write a compiler that generates code better than I can write in assembly," and so on, which always amazes and amuses me to no end. Whenever new technology comes along that threatens the status quo, people start by downplaying its relevance; later they concede, "well, maybe it solves a few problems, but not all of them"; and at some point they're completely overtaken and don't even think twice — they just use it without stopping to think about it. By the way, that meta-pattern (or mega-pattern) is described very nicely in a series of books by Clayton Christensen, who writes about the innovator's dilemma.
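Coming back to the failure point at the start of this section, here is a minimal plain-Java sketch of how the boundary leaks into client code. The exception class and proxy are invented for illustration; in real Android code, every AIDL-generated proxy method declares android.os.RemoteException for exactly this reason:

```java
// Sketch: a call through a proxy can fail in ways a local call never does,
// so the client's error handling can't fully ignore the boundary.
// RemoteCallException and FlakyProxy are hypothetical names.

class RemoteCallException extends Exception {
    RemoteCallException(String msg) { super(msg); }
}

interface Service {
    int compute(int x) throws RemoteCallException;
}

class FlakyProxy implements Service {
    private final boolean linkUp; // simulates connectivity to the real subject
    FlakyProxy(boolean linkUp) { this.linkUp = linkUp; }
    public int compute(int x) throws RemoteCallException {
        if (!linkUp) throw new RemoteCallException("network unreachable");
        return x + 1; // pretend the reply came back from the real subject
    }
}

public class FailureDemo {
    public static void main(String[] args) {
        try {
            new FlakyProxy(false).compute(5);
        } catch (RemoteCallException e) {
            // A collocated object could never fail this way.
            System.out.println("remote call failed: " + e.getMessage());
        }
    }
}
```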

Christensen's books on so-called disruptive technologies are worth looking into — I won't go into them in detail right now, but it's great stuff. The point is that what were once patterns find their way into language features.

Another set of examples: going into the early 70s, there were still a lot of people writing code in assembly language, and a lot of people writing code in C, and they were beginning to write bigger systems, so they had to figure out some way of hiding information so that application developers — often themselves, because the programs weren't all that big — wouldn't end up tightly coupled to implementation details that were likely to change over time. If you recall, in 251 we talked about stacks and the different ways of hiding the representation of a stack so you avoid writing code that relies on a particular implementation. What happened was that people began to come up with patterns for hiding information — in C, for instance, you make things static, or you define a header file with a set of functions, and so on — and those information-hiding patterns turned into language features like classes, packages, and other mechanisms that, with automated checking by the language tools, make sure things really do stay private: you can't reach private stuff, and access-control checks keep that from happening. When C++ first started out — and for at least a decade afterward — it had no support for parameterized types, which was uncomfortable and inconvenient (Java was the same way for a long time, by the way). So back in those days, people would follow patterns of preprocessor directives, or write little tools with sed and awk, to annotate their code with template-like syntax that would then be preprocessed and turned into actual code. That experience with macros and those other tools led to templates being added to C++ as a language feature.

There are lots and lots more examples. The Iterator pattern, which we learned in 251 and have talked about here a little, used to be done by convention; nowadays, if you look at Java or C++11, they've baked the Iterator pattern into the language with things like range-based for loops, or the for-each loop in Java. So once again you see this never-ending trend of moving things from convention into automation — and there are pros and cons with that, of course.
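The iterator example above is easy to see in code: once a class implements Iterable, Java's for-each loop drives the iterator for you — the pattern has become a language feature. The Range class here is a toy written for illustration:

```java
import java.util.*;

// What used to be a hand-written convention (explicit hasNext/next calls) is
// now driven by the language: implement Iterable and the for-each loop
// expands into the iterator calls for you.

class Range implements Iterable<Integer> {
    private final int lo, hi; // half-open interval [lo, hi)
    Range(int lo, int hi) { this.lo = lo; this.hi = hi; }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = lo;
            public boolean hasNext() { return next < hi; }
            public Integer next() { return next++; }
        };
    }
}

public class IteratorDemo {
    public static void main(String[] args) {
        int sum = 0;
        for (int i : new Range(1, 5)) { // compiler expands this to iterator calls
            sum += i;
        }
        System.out.println(sum); // prints 10 (1+2+3+4)
    }
}
```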