Refactoring Legacy Code Bases | Hadi Hariri

this is a most thing that I i’ll be showing a some code mostly on site which is the best way to talk about code is on slides not inside Visual Studio and I’m going to talk a little bit about techniques and things that you can do to kind of reflect code bases and why we need to refactor code bases things like that so I uh I always say apply the little two three so basically it means that if you feel you’re in the wrong session or if you feel that everything I’m talking about you already know well just stay put okay you can get up at me I won’t be offended I’d I have a good good way to remember faces though son right can somebody tell me what this is oh oh the other thing it’s gonna be a breeze it’s early it’s eight thirty in the morning you guys are still asleep how many of you have to have coffee how many of you still drinking the beer so we won’t make this too much of a rough session the next one is going to be rough which I’m not giving so what’s this green field which represents what very new program which represents beauty which represents I get to do it absolutely anything I want right how many of you work in this field yeah how many of you work in this field all right the vast majority of us because as soon as we start something it’s like a week gone in and it turns into this and I hope god I wish I could have started this again using something else or a different way so most of our life we’re spending in this kind of place which is kind of stinky and it’s not nice but nonetheless we’ve got to make the best out of it so the question arises of do we need to refactor and why do we need to refactor I’m assuming everybody knows hear what refactoring is right now okay good so we all know what refactoring is so can you tell me why we need to refactor anybody let’s make this interactive good point so you can take other people’s shitty code and make it good you write the best code what else to make it readable what else how many of you refactor okay I would say the main reason for me is to improve understanding of someone else’s code were you at the Dino session last night here who was a dino session last night that was talking about writing code that other people can understand so that’s a very important thing because if you look at our life the amount of time and this is true this is based on scientific evidence okay it’s not but it’s quite accurate the amount of time that we spend writing code versus the amount of time we spend reading code is quite substantial we spend a lot of time reading code now trying to understand what we’re doing what we’re trying to extend what we’re trying to fix it’s a lot of time that we spend reading code and if you take this time the amount of time of what the hell is written here versus oh I understand what’s being done here is also large right there’s again so much time of what was he trying to do you know so much of that and a lot of it is because we have different styles we rewrite in different ways then technologies appear and we decide that it’s great to use these technologies for instance link how many of you know link right everything suddenly started to be written and linked in fact I work for jabbering so we have a tool resharper and we have this convert to link and we thought oh there’s a bloody good idea to you know automate this no it was not resharper when you convert to link produces some of the most horrendously unreadable code ever so we introduced our take link and convert to for expression which wasn’t controls it i mean you know take existing code because it’s really hard sometimes to read this kind of readable unreadable code so that’s one reason to make code clearer to read would you agree with that okay

good another reason what else building new features so why would I refactor to build new features make it easier to build new features how would I make that easier by refactoring simpler design what else but next okay yep fair enough would you say minimizing change all of that is pretty much related to minimizing change no so if I have code and I want to build a new feature what is the biggest issue I have with building a new feature assuming that I know exactly how to build that feature and I have the perfect requirements on an existing code base what is the biggest problem sometimes we face when we want to introduce a new feature it makes it unstable right because how many times have you written code or added features to existing code base and it’s come out beautiful perfect and hasn’t broken anything never right and as the code base grows this becomes bigger it becomes a bigger problem if you look at this graph of a project lifetime everything grows right I mean absolutely everything we have the underst well that actually says understanding but you can’t understand it oh isn’t that ironic that says understanding that goes higher it’s harder to understand which means that it’s harder to change which means that its costs more over the lifetime of the project everything just increases and increases and increases and a lot of that has to do with understanding and also change every time I introduce a change to my code base I run the potential of breaking something especially if I don’t have unit tests how many people unit test right the rest implement ship and pray that’s the other process in especially if I don’t have unit testing I’m a catch it I make some changes i add a feature and that can potentially break something else and I’ve had this happen to me millions of times I’m a really bad programmer fine I’ve had it happen to me a lot in fact there’s one instance which I always uses an example of how badly you can screw up is we used to do this software for print industry which is very big in Germany and in the print industry you have to print out sheets of paper and you do like magazines so you say for example a 64-page magazine you take a big sheet and you divide it up into eight pages and then you have to fold it in a special way print on both sides etc etc and one of the things that you have to do also is put a protective varnish on it after you print it so that the ink doesn’t run right the protective varnish on a sheet of maybe let’s say an a3 or something like that can cost you 0.005 cents right but on a 128-page magazine that you’re doing 10,000 runs adds up to some money so we had this software which was not unit tested and it was very very complex it was an ERP and it would do these calculations and we went and added some new feature and when we added that new feature lo and behold what did it do somehow some if branch didn’t get executed and it stopped calculating varnish costs and after a week we realized right luckily we had only deployed it to my 21 customer who happen to also be my partner so only he was affected by losing something like 25 or 30 thousand euros right no one else was we were lucky in that my partner was a nice guy that didn’t sue me and also we didn’t deploy it to all of the other customers and we had absolutely nothing to do with varnishing what we introduced had absolutely nothing to do with varnishing because you cannot predict the consequences of changing code you do not know what you’re going to affect by changing some code so it is very important to refactor in order to minimize change and one of the key things in code in creating and maintaining legacy code is minimizing change creating systems refactoring systems to create other systems that have minimum impact when we introduce new features or fix bugs so I

would sum up in my opinion at least that the main two reasons that I refactor is to improve understanding which ultimately produces decreases the cost of implementing new features and also minimize change the next question of course that comes up is when do we refactor right many times we say well we have we start with the project or we’ve got a new version of that project and we got to get out these X features and as we’re going along we start to see that there are issues and we say well we can’t we don’t have time to do this now we’ve estimated that this feature is going to take five days and therefore we will just hack around this put a little to do and move on how many of you have done that in the past everyone’s done that right and we say that one day we will create an iteration or a sprint or take time off of ten days and say let’s refactor that is going to be the big refactoring day right how many times does that happen it never happens and when we’re close to it happening we’ll just put it on for tomorrow and I tomorrow a mystical land where ninety-nine percent of all human productivity motivation and achievement is stored so true isn’t it I’ll just delay this for tomorrow and tomorrow never comes so point is that I don’t find it very productive also to say I’m going to dedicate a week or two weeks or three weeks to refactoring because it just doesn’t work just doesn’t work rarely do you have time in a project to say let’s just take a month off and refactor you just can’t especially if you’re doing products especially if you doing products I mean for example I’ll give you the experience of resharper and we shall pass it five we shipped with a lot of new features and then we shop at six i would say around fifty or sixty percent of the time of resharper six was spent refactoring right cleaning up the code base but at the same time we had to introduce new features because we couldn’t just ship resharper 6 and say what’s new well it’s much cleaner inside but you can’t people don’t care even though in really they should care because it’s cleaner inside which means that there’s less possibility of things breaking when we introduce new features or when we fix a park or it will go faster because there was a hell of a lot of increase but you just can’t say what is just cleaner on the outside it’s exactly the same but inside it’s much nicer and here pay me 200 euros you can’t so we had to refactor and introduce new features at the same time so you can’t do that you can’t just take six months off a project and say right now I’m going to sit and clean it and there’s no way in hell you’re going to sell this to management well you see the thing is that is if I do this next time the customer asks for a feature will be much better it just it doesn’t work doesn’t work this is my similar kind of to my kids wardrobe right if you have kids it kind of normally the vast majority of houses look like this and of course what happens I’m actually amazed at them sometimes it’s like they want to play with a toy and they find all the little pieces in all different areas right but it takes them like two hours to find all the pieces and then they have 10 minutes play time and then they lose it all again and I say to them just try and keep things tidy so that next time you know when you’re looking for something you know exactly where to look and it’s much the same way with code if we could keep things tidy for keep things organized we exactly know where to look for things of course the problem is that yeah hide I didn’t make this mess why should I clean it up right which is the problem that we have now let Joe clean it up he wrote it but ultimately if we have a situation like this it’s much better and this raises another problem on many teams I’ve noticed that they kind of one person can be responsible for specific area of feature of a project so what does Joe do well he implements I don’t know CSS support or he does the invoicing section and Joe knows everything about the invoicing section and if there’s a bug Joe will fix the bug and if there’s a new feature Joe will implement that new feature do you suffer that problem on your team’s right

and I say it’s a problem why is it a problem because in the short time it feels very good it feels like a fabulous idea Joe knows everything there is to know about invoices so Joe writes everything about invoices and joe wright’s with his coat style and Joe does everything himself and Joe fixes it until Joe goes on holiday then we’re screwed and if Joe dies it’s the end of the world right you’ve heard that saying of the the bus number or the truck number know is how many members of your team need to be run over by truck for the project to fail I know it’s deep dark humor but nonetheless it’s true right I mean if you have a team of ten and two people get run over by truck or okay let’s be nicer they move to Hawaii and your project is kind of left there like move now what Joe knew everything it’s a problem that’s why we try and do certain things in a way so that we don’t have dependency on a single individual such as for instance automating build processes right no one in their right mind today does a manual build process you automate it not only because you don’t want to have a dependency on one person but also because humans make mistakes machines don’t and a build process is ten steps and you it doesn’t matter the more you do the same steps over and over again the more chances there aren’t that you will fail it’s like driving if all you do is drive up and down the same road you start to gain so much confidence and that’s what you’re going to end up having an accident right it’s the same with build processes it’s that dependency we try and eliminate so what happens is that we have teams that create little silos that are focused on one person implementing one feature of one area and what happens is that side effects of this is that first of all no one else can read that code because no one else reviews it because why should they they’re not focusing on that they’re not fixing anything there they’re not fixing any bugs or implementing new feature so why should they care about that code will let Joe do it and that’s wrong because we don’t want to have a dependency on Joe and not only for our sake or the team’s sake or the project set but for Joe’s sake people get bored working on the same thing year after year after year after year they need to move they need to evolve they need to educate themselves they need to find other things they need to experiment they need to learn from other developers if what you’re doing is working on your area that’s a problem so even if it’s not your mess it should be you should be responsible just like anyone else to clean it up unfortunately this also leads to another problem which is it’s called ego right wait a minute you took my code and you rewrote it like this I’ve had that issue a friend of mine used to say the best thing you can do on a team is when you come to work in the morning leave your ego at the door and pick it up on the way out right because that is a big issue that we have with developers and people you know you do work and someone else comes and says well that’s not that nice let’s just do it this way there is a clash all right and then that exist in any any job any job anything you do right someone fixes a tire on a wheel and someone else comes and says nope now let me just change the way you fix that it’s going to offend the person but we got to learn to communicate and deal with this issue but that’s another talk so before we start anything before we start all of this refactoring and that there’s a few minimum required steps that we need to take first of all there’s this concept of source control which I’m hoping and assuming that everybody does use yes and as I said yesterday visual sourcesafe does not count as source control nor does folders that I had a customer which and i’m not talking about someone like 15 years ago now i’m talking about two years ago their source could not actually this customer is still doing it they’re still doing it and if i told you the name of the customer you would you collapse right here because it’s one of the largest companies worldwide but anyway that they’ve got a little department and they’re doing some internal stuff I said what are you using for source control right oh well we’ve got our own mechanism I’m like what’s that well we take files we zip them we give the zip file a name which is the date time and a

our daytime and then we store it on a shared folder and we have a little text file inside the zip file with comments of what those changes can take I mean I can’t begin to think like how many how could I tell these people that they’re doing it wrong right do you know source control exists here but it’s too complicated for us because this is so much easier right so much more fascinating everybody does use source control yes anybody tell anybody still uses zip files with no good and automation that again automation is very important you need to try and automate as many things as you can in your in your life source control allows on dues and what it also allows you to do is refactor without fear and it also removes comments because i often find it fascinating that people use source control and then i find this in code right I method completely commented out and checked in what is that well I’ve made some changes in this methods no longer required so why is it there well in case I screwed up yep but you just checked in a version why don’t you just roll back oh it’s easier like this okay so now I come across this one already and I’m like why is this commented what what should I do there’s no you know name of who commented it or the reason because that’s an actual that’s a commit message right the source control does that for you so why are you checking in code with comments this all contributes to misunderstandings and decreasing readability what is worse than this is this this is horrendous because I have a method that has some information since I had some code and then this doesn’t line commented out what does this mean don’t I need to set up defaults anymore and I had this situation occurred to me with Objective C we were in a company where we were just starting out doing some iphone development and we made an application for talking to exchange server right and the guy that wrote it his name was not Joe it was Jose which is Joe in Spanish he left one way of putting it he left and so I had to take this over which was on the problem because you know everything was working fine and we were selling it one day I get a email support through the shop and reviews that this doesn’t work I’m like oh ok so i fire up apple and i fire up the mac and I’m like hmm where do I look so what’s the problem it doesn’t connect it just doesn’t sink like okay and I start to look through a code and I kid you not there’s a line to a method called called Kinect that is commented out I’m like this can’t be I mean how can people be using this with this version this is not a new version it’s the exact version is shipped and this line commented out disconnect it can’t be so of course what was my natural reaction I uncommented the line built it shipped it waited three weeks for the update on the App Store no I’ve still doesn’t work then after about two months we figured out that because the other problem is that you know people aren’t immediate specially people that use an iPhone app that they paid fifty cents for they don’t get back to you immediately we found out that the problem wasn’t that line connect it was that the guy was using a newer version of exchange server and the protocol has changed so it was just failing to connect right now yes I admit that I’m not the brightest developer in the world but when someone says to you it doesn’t sink and you open up the code base and the first thing you see is a line that says connect and it’s commented out it just can’t set you down the wrong path right so what is the purpose of leaving that line in there and obviously I did drill into the method and it did say setting the port and all that and it was connecting and all that right anyway this is better I’ve removed set up because it’s causing an issue on second consecutive calls right but for that again we’ve got source control just remove that code and check it in as a comment to source control because this is even better now I set a date I’m author but again you can do this all in source control right and then we figure the problem is that we do these things and then in source control in the commit message is we like fix some stuff right

that’s that’s that’s our message fix some stuff have you guys seen the daily commits com who’s seen the daily commit there’s a website that’s daily commit and what it does it pulls the daily commit messages from github that have obscenity in the commit messages and you should read through some of them and so much frustration and anger in us as developers so automation one of the issues again is all this is too hard to automate so that’s why we’ll just do it manually then offset icon there’s no other words to say about that okay so now let’s get to actually refactoring and the first step that’s important is impact analysis so before I refactor I need to know what it is that I’m going to break because that’s what happens when we refactor we break things especially if we don’t have tests we end up breaking things and that is very important that is another good reason to have tests write tests go from writing our code to becoming a unit test verification and then as soon as they become the unit test and it passes it just immediately changes life form and it becomes regression test making sure that any other changes that unit test is there so it’s got three purposes in life and still half of the room doesn’t unit test by the way why don’t you in a test those that don’t unit test why okay I’ll do this another way how many of you unit test how many of you think unit testing is good right it’s always the same it’s like how many of you know smoking is bad how many of you smoke would you would it be fair to say that have you ever said the heart I don’t die that I don’t get paid to write unit tests now I don’t get paid to write if statements either but I write them okay my turn the flight the guy that goes on a plane it’s like I play I paid for this flight I didn’t pay for you to test this stuff right impact analysis helpers or how I can determine what things are going to change there’s a bunch of tools one of them is n depends i mean even you know of end append that’s a very good tool because what it does is basically analyzes your code and it tells you what module has dependency wat module and if you get a graph that has a lot of wires going back and forth that’s that’s kind of like a no-no that’s a little bit of an issue so you want to kind of isolate things as much as you can another is a my favorite tool of all time I don’t know if you’ve heard of it cold with sharper and partner it’s called resharper there’s alternatives out there as well you can use any of them just don’t use Find and Replace in visual studio and the fine result resharper works with fine results so you can search for implementations of a specific type you can search for all of the different like for example if you’re looking for an interface or you’re looking for a class and you want to see how many implementations of this there are because as soon as you change an interface to that class or whatever it can impact it’s very good in finding all these aspects of your code and it’s things you need to look at before refactoring there is the refactoring with our tests issue though how do I refactor if I don’t have tests because that is a problem because if i change something and I don’t have a test how do I know if what I’ve changed works or not and that is a fear that we have and that’s why we don’t touch things and the reason often we all have test is because it’s too hard to isolate we say well I can’t test this it’s too hard to isolate but here’s a fact for you an integration test is better than no test if you cannot isolate something just write an integration test for it and replace some of those integration tests to make it faster one of the main reasons do you know the difference between a unit test and integration test unit test I ideally tests one thing in isolation so that when i run that test and something fails i know what’s failed what that thing that i’m testing and it’s all so fast integration tests tests integrated components of the system right so if something fails something of those have failed and I’ve got to figure out what it is and also it goes slower but having no test is not better than running an integration test especially when you’re doing refactoring so while in the beginning of my career in terms of unit testing I was like saying okay I have to do unit testing of every single module and then I’ve got to write integration test of all combinations of

modules together and i end up with seven thousand units has four five lines of code now I’ve moved more towards the maybe I can just not use so much mark objects and maybe I can just use in-memory databases maybe I can you replace some mock file so maybe i can replace some HTTP calls with my own dummy class and stuff like that and focus more on integration tests and have them run and just speed things up if I speed things up it’s it’s okay it’s a compromise because then what happens is that the only thing that I lose right versus unit testing is if something breaks in that test I’ve got to figure out what it is versus in a unit test normally its immediate right but a little stack trace analysis doesn’t harm anyone right so move a little bit more towards integration test and if you can just isolate some things to make it faster focus on making the integration test faster but have integration tests because they really are beneficial especially when you’re refactoring especially if you have no test at all I also you can use code coverage for non happy paths what does that mean code coverage is normally focused around oh I’ve got unit tests so I’m going to run code coverage on these unit test to see what percentage of my code is covered by tests and if it hits seventy percent I’m doing good right because the exact percentage of code coverage which is valuable is actually scientifically again proven that is seventy three point eight percent if you hit that mark you’re good if you’re 73 but I’m bullshitting you okay good there is no the code coverage has never been about percentages and has never been about seeing how much of my code bases covered I’ve always looked a code coverage as a means to find out if there’s something that I have not covered has a potential problem in other words when I run code coverage and I see forty percent no coverage I’ll look through that forty percent and see I’m okay well I don’t care about this line of code maybe I don’t need to cover it so if you look at code coverage and this actually looks really nice on my screen it doesn’t look nice there because this lines here are green they’ve got a green background these lines here are pink which you can barely make out so anything that’s white here because you can’t actually see the green can you know so these ones actually have a green background those ones have a green background this one has a red right and this isn’t unit testing this is I’ve used a tool that I’ve run coverage I’ve run my application I’ve gone through a series of scenarios and it’s given me coverage and what is saying to me is that these lines here are not have been hit when I’ve done that scenario these lines have not been hit so without unit testing I can still cover my code and see if I’ve refactored and I thought that when I refactored this should have been hit how come it hasn’t been hit right and it’s an ideal way to find bugs just running code coverage on your code base not even your tests and seeing why isn’t it not being hit I found so many bugs like that like for example one time I wrote this sophisticated doll that had a rollback mechanism of several layers and shipped application version it was working everything and then one day I realized that rollback was never even ever ever ever being cold how is that possible it was working but I found a potential buck detecting code smells because when we refactor we also got to see what things we want to take out what are the problems that we want to focus on if I’ve got a really large code base I want to see where what issues I want to improve so we got to detect those code smells and what our code smells well first of all the less-is-more or also known as single responsibility how many of you know single responsibility right who can tell me what it means go ahead now should have now but tell me the principal right class should do only one thing class should only be responsible for a single thing okay what is that thing that’s the key question of single responsibility principle right where do I draw the line on that thing I’ve got a customer class and it creates customers into these customers it writes invoices is sense customer reports it does everything to do with customers it does one thing customers that’s all it does it’s not a valid single responsibility principle no so what he was saying is

well a class should i only have one reason to change and we’ll get to that right but before we get to that symptoms you’re breaking signal responsibility now I don’t have that actually working but this is an example this was a video that I recorded because I couldn’t fit it on the screen it was a class with like seven thousand lines of code right and it was just recording through the video and it’s actually from an application called ERP store that’s on codeplex you can look through some of the classes although the guy after he saw that I’ve been talking about his code a little bit he he has improved it but you know it’s good because it’s constructive criticism right this was a very very long class and it did absolutely everything sometimes you’re breaking song no responsibility principle is if you use a plug-in called rock scroll right how many of you have heard of rock scroll rock scroll and sublime text has this feature to as well is basically a plug-in you have for visual studio that here hanselman even wrote about in a nice blog post where you can this is a preview of your file right no no I don’t need to have a preview of my fault because that’s indicating that I’ve got way too much in my file way too much and people say rock scroll doesn’t work with resharper and I’m like who cares you don’t need rock scroll rock scroll is putting a patch on a problem it’s not fixing the problem you’re just using raw scroll to have a preview of of your file because it’s so large it doesn’t fit on the 3860 pixel screen that you now have right regions that’s another symptom of breaking single responsibility principle often right do away with regions they’re evil there’s absolutely no reason in life to have regions nothing except if you want to hold license information at the top absolutely nothing this nonsense of oh I’m going to decorate my interfaces and my properties and my member come on the last thing I want to do when I look through code bases click that plus sign and often regions aren’t used to hide this information but they’re used to hide other information that doesn’t belong in that class so let’s solve a single responsibility principle and the first one is reason for change take a look at your class and see how many reasons are there to change that class and the more reasons the more problems you’re going to have because if a class changes for various reasons that means that if I make a change it can potentially impact that class because our class can change for multiple reasons if the class only changes for a single reason when will I know if that class will change or when will it be impacted when I change it for that reason are you following me or too many reasons and changes am I saying right the least less number of reasons the class has to change the less probability there is that it will be impacted by change you make in the system it’s not simple that is the only reason that we use single responsibility that and to make it easier to understand if I look at a class and it does 100 things it’s harder to read it’s harder to maintain it’s much easier to have two classes that due to different things first is one class that does multiple things it also allows for better modularity it also allows for better reuse of code so the reason to change then there’s naming okay and I used to make a big mistake with naming I mean not as big as these people calling their kid Apple but I used to say oh I’m going to create a class i’m going to call it a and i’ll give it a name later how many of you have done that or just giving it some name and said i’ll just change it later right you’ve not done that good on you if you look at a class and you look at the methods what you want to try and see is how cohesive those methods are okay how many of you have classes that are called blah blah blah service blah blah blah utils blah blah blah manager god there’s only one kind wow what do you name your class is anybody ever used service in their code base manager utils help her ah why you naming things helpers because all the other classes are just a holes and the only ones that help actually is the helpers and then we get microsoft shipping that nonsense and we’re like oh if Microsoft does it should be okay

right why we’re naming things helpers I like to I now come to the name in the second foot look at the cohesion stuff right there’s a feature in resharper which is make method static so basically what it does it looks through your class and says well I have you have a method that does not use any instance variables do you want to make this method static I normally don’t make method static but I do use this in a different way I use this as a sign of a smell that maybe that method does not belong in that class so how cohesive is your class do an experiment take your class take all of the parameters that you have in your methods and make them instance fields right so instead of using parameters in the functions just have them as instant fields and have the functions share these instant fields you know everything that’s bad like maintaining state do that now look at your class and see how many of those instance fields are used by more than one method if I have ten instance fields and 20 methods and 15 of those methods use three instance fields and the other seven are unused you start to see how much cohesion there is between the different methods in your class it’s just an experiment I’m not saying that you should stop using parameters and move to instance fields but it’s about how much how cohesive your methods and your instance fields are together how much information do they actually share right so when you have this make method make a method static it’s saying that maybe this method does not belong in this class right then it doesn’t really use anything of this class maybe it doesn’t belong in that class then this emergence through naming it’s trying to find responsibilities and trying to isolate code trying to back to your classes to have a single responsibility by naming them accordingly and when you use things like manager and util and helpers what goes is a tendency of putting everything in there it’s a helper class i’ll put everything in there it’s a YouTube class i’ll put everything in there but there’s a problem with that that we don’t really appreciate you know that you have a method that is a helper that is enough class called string helpers I don’t know that and when I come onto the project and the project is 60,000 classes and I want to add a little helper method of my own that does something similar and I know that you’re naming is not too good what am I going to do I’m going to create my own helper class and i’m going to create my own method that that’s exactly the same because i don’t name things properly because i just add a suffix of helper and just hope for the best why don’t we name things properly why don’t we say you know what does this class do calculate estimation approval or call it calculate estimation approval now we’ll call it calculate estimation approval service and then we’ll have a method called execute apply single responsibility on methods far too often I see people not doing that at all they just don’t apply srp our methods it’s just applied to classes they think it’s only available to classes so for example I have a method like this do something and do something else and that is if what to do do something else do something else why is there a reason to have that if you have and in your method name it’s an indication that you’re doing too much in that method name now there are rules that says Oh a method should only have three four lines of code oh yeah if you have a hundred lines of code try separating those two methods up into so using boolean using any type of parameter value to use as a conditional to decide what to do is normally a code smell that I could just do two functions and name them appropriately what they do so refactor these things up a tool that uses that is again n depend it’s very good for that what it allows you to do is look for methods that match certain criteria like cyclomatic complexity like number of parameters that are being passed in these things you can find easily with and depend interface segregation that’s another good example of things that we need to refactor out this is a simple interface forward reverse left turn right turn take off and land right and if i were to implement a car that’s what i would find now no one in their right mind would

write code like that good day no well yes they do actually if you’ve ever implemented a membership provider in asp.net that’s what you come across I need two functions I need my own membership provider for this okay implement a membership provider and just to implement these two functions do you realize that what we’re doing is completely going against everything that the principles are telling us to do in essence an interface is what it’s a contract and we’re saying this interface has these methods which means it abides by these contracts oh yes but we only use two of them mine I’ll rent you my house but yeah I don’t worry about the seven eight nine and ten close we’ll just stick to the first four if an interface is a contract if I’m reading this code and I co a car and I vehicle Oh take off now this is not implemented you could be nice and say not needed I don’t need this method or you can be really nasty and just delete the code here since I’m not returning value it’s fine I’ll just delete the code and I’ll leave there a method blank and then two years go by and someone else is looking in the code and they’re like this doesn’t work why is this method blank maybe I should implement this method where is the documentation there is no documentation where is the test there is no test so why am i leaving methods blank and the problem with this is that we need to refactor out these kinds of interfaces and that’s what the interface segregation principle tells us to do is make smaller interfaces more concrete on what exactly those are doing basically it’s applying simolit single responsibility principle to interfaces I now have a reversing vehicle and I have a flying vehicle and I have a land vehicle and then when I implement a car well then I’ll do an implement Island vehicle and I reversing vehicle and when i implement an airplane i will do an island vehicle and I flying vehicle right so you need to refactor these things out because if i look at my code look at your code base and I see interfaces with 30 methods it’s harder for me to know what I have to do and what I don’t have to do it’s harder to name those interfaces what how can something do 30 things often very hard and the problem is that this leads to other issues you know I have a talk next week which is modularity it’s about how to create like little pieces of code that you can reuse and not have to reinvent the wheel each time and a lot of times the reasons behind this isn’t some secret but it’s how we design applications which we create these larger classes larger interfaces and then when we want to reuse them we say well this isn’t quite the way I needed it so i’ll just create a duplicate or i’ll just change it in this way and if we abide by these kind of principles we do single smaller things that we can classify as modules that we can reuse dealing with dependencies so i have a class than that class talks to a database or talks to a socket server and i need to refactor it and now i I’ve run into an issue because I don’t have a database I don’t have a socket I don’t have a unit test I need to refactor that but I don’t have a unit test what do i do I couldn’t MIT yeah so how many of you are familiar with dependency injection okay so the first thing that you can do is just as parameters or whatever if you’re making them ethical or whatever is just pass in the null object and hope everything is it is good and you know nothing breaks the next thing is extract interfaces and just you know those into dependencies that are being passed in extract them out and then provide interfaces so that you can replace them and if all that fails you can use type mark or moles or something that allows you to override behavior but when dealing with legacy code how do we do this dependency injection do we do dependency injection or do we do service locator do you know the difference between the two so dependency injection is basically if i have a class let’s say I have a I’ll call it a customer management that’s a good name customer management that talks to a database so that customer management talks with database so therefore i have my database access abstracted in another class so the customer management now in some method instantiates an instance of the database class and talks to my database that’s a

no-no right that’s a big no-no because now it’s very hard to test that it’s very hard to change the database because now we’d have to touch that code and change the code and instantiate a different class etc so the way we solve that is via dependency injection which basically means instead of the customer manager class creating an instance of the custom adult class the customer data access class we pass it in as a parameter to the constructor so I can change the and better than passing in a specific class I can pass in an interface so I can now have different implementations that is dependency injection basically it’s just passing parameters to a constructor all the dependencies of that class needs back in the old days instead of that or still instead of passing in dependencies what we can also do is just create what’s called a service locator which instead of my customer manager class creating an instance of the customer data access class i delegate that to a factory and I say create data access and then I can configure that factory somewhere else in the application I don’t favor service location personally I prefer to use dependency injection and although it’s easier a little bit to say yes in a in a legacy code service location is easier to implement why can someone tell me well because we service location I don’t have to change the interfaces of the classes I don’t have to add parameters two constructors so therefore I break listings as soon as I use dependency injection I’ve got to change the parameter of a constructor which means now half of my project won’t compile right so in essence yes it’s easier to do but in the long term is going to lead you to more problems because eventually you got to get rid of service location and dependency injection hose is good in that not only does it allow you to isolate your dependencies but it also gives you code smells right here is an example of order controller from real code which is the example I was talking to you about before and if you look at this and you see one two three four five six seven dependencies being passed into that class and I’ve seen this way too many times but it’s okay i’m using dependency injection surely this is fine right because I’m using it I’m doing everything the book says I’m passing in my dependencies yes what you’re passing in seven dependencies what does that mean that means that the class that is using these dependencies has tight coupling to every other class that is being passed in sure it’s not going to use a specific type of encryption but nonetheless it depends on the encryption service and if you have this types of graphs throughout your application you’re not accomplishing much you’re using dependency injection you’re configuring it using an ioc container but at the end of the day you got tight coupling because everything depends on everything so when i use dependency injection as opposed to a service location it also provides me as a code smell because if i see that a constructor is taking in more than two three dependencies that’s a sign that this this class is probably doing too much so when you are if i say basically what i’m saying here is that i recommend in reef in legacy code bases to start with dependency injection as opposed to service location the problem is of course that i start to break too much of my code and what you can do there is use what’s called poor man’s dependency which basically is create a parameterless constructor that creates instances of your class and one that you can pass in dependencies and then gradually gradually remove those parameterless constructions that also brings us to the question of ioc now ioc containers basically what they do is configure all these graphs for you so that you can just do this you can do this and magically the ioc container is going to fix it all up for you fantastic recently there’s been talk that oh ioc containers are bad because they allow this kind of behavior no this is bad developers allowing this kind of behavior ioc do their job it doesn’t mean that you have to do it badly so i am all in favor of using ioc and all of this is to minimize change and impact here’s another one of minimizing change and impact this this kind of code we need to refactor out that what does this do based on state it does different things and of course the problem is that i’ve got this type of code repeated in several places in my code base right

this type of code we can refactor out can anyone tell me what we can replace this with if I’m in this state do this if I’m on this state do that if I’m on this state do that what can we replace that with it’s a pattern drying up things here’s a simple example that I like to refactor out this is an asp.net MVC application right and I often see this if employee equals not now return view I’ll read our egg not found I see this many many times so you look at coding you constantly see repetitive code at the beginning of a method the problem with that is that you won’t solve that using something like resharper templates you shouldn’t use resharper templates to generate boilerplate code that is constantly repeating that’s the wrong thing to use right that is solved using a different problem that is solved using aspect-oriented programming are you familiar with that right so basically put it a simple way attributes everyone knows attributes right you can use attributes to solve that problem in asp.net MVC basically I’m saying that I’m going to use aop if you’re not familiar with aop what is AOP every class has a single responsibility okay customer does one thing employee that’s one thing invoicing does another thing etc however there are certain aspects of a class that are cross-cutting there are certain things that all classes have like security like checkings logging etc those aspects you have to extract and use aop I’ll talked about this in a second but first step is to find this kind of code is to find repetitive code and again you can use tools resharper allows you to do that I think other tools allow you to do that as well which is searching for code that is very similar right I think even visual studio has now some functionality which is allowed to placate code okay in resharper we’ve got even a better one which is pattern which basically allows you to define your own pattern in your own conditions and then search globally for those patterns detecting code smells that’s called the custom patterns or search or in place so I can say look for this kind of code and it’ll tell me oh you’ve got this type of code in several places and notice that here it doesn’t care about name of the variable it just detects the type of pattern and then you can replace that and when it comes to replacing this kind of functionality is dealing with repetitive code you ask a series of questions what functionality am I replacing what is it doing so I have to first extract our functionality then I have to evaluate the responsibility of that functionality is it a method yes so I can just create a method in in that class and have that all of the other methods call that method fine we’ve all done that extract method refactor to extract method however if I find that that kind of code is being repeated across multiple classes then i’ll create a helper now I’m kidding the i’ll create another class i won’t name it helper i’ll create another class that does exactly that but the question now is how do i pass this class over how do I have all of my classes use this class and there you have multiple options again you have dependency injection right I can use dependency injection you can use tools like LP frameworks or aspect-oriented programming right which allow you to basically inject code into other classes so that those classes can use them without passing them in as dependencies in.net aop can be accomplished using things like attributes or you can use a post compile a lot of tools that we’ve in functionality to a class for example if you heard of postsharp right post sharp allows you to do something like that you can also use a 0 p with ioc containers how many of you use an ioc container like what do you structure map yep so structure map allows you to intercept functionality by passing in other classes and you can say when a method is called I want you to first intercept it and this other method so you can use that as well if you’re using I occ ioc’s have more benefits than just wiring up things other things that you have to deal with an refactor is composition over inheritance that’s a big problem right especially when we have these kind of what you call it a long inheritance chains how much time do I have left I still got like 10 minutes right yeah

well good good so if I have composition over inheritance I mean sorry if I have a bunch of classes and then one class inherits from another and then another and then another know now that you have this long hair are key of classes and what often happens is that you end up with like the third of fourth class in the chain having a series of methods that it’s inheriting that it doesn’t need it doesn’t require it doesn’t use but because it’s somewhere in the chain define that class is getting it and again this leads to readability problems this reads to understanding why do I have these methods all their inherited why do I have them well because we kind of needed sixty percent of the functionality and then we got this forty percent added as a bonus which we don’t use but all that again leads to single responsibility all that leads to interface segregation etc and in essence one way to avoid that and one way to avoid all the problems that you’ll encounter is to try and favor composition over inheritance because that way you can pass in dependencies as well so instead of having one class inherit from another class to use a functionality just pass in that dependency parsing that functionality as opposed to creating inheritance chains one other way that allows you to lower the side effects of code and again something to use when refactoring is command and query separation and I’m guessing that you’ve heard now it’s very big in the space the CQ RS which is command and query responsibility separation this is the same thing with this this goes back to all the old days it’s been around for many years principal has been defined it was defined by mayor for what was an Englishman was it was the language to define he defined this in it wasn’t small talk it was a Carmen anyway basically what this says is that we need to have methods that either do something which means do something is all to the state of a class and methods that return values so we separate commands commands are methods that alter the state of the class from queries that return values what this allows us is Edom potent calls so that means that it doesn’t matter how many times I call a query it should never change the state of a class which means that I minimize side effects right typical example of something breaking this is returning an enumerator and every time you call a numerator it returns the value and incremented by one that breaks this principle because if i call the numerator twice the value of that numerator has changed on each call right and the idea here is that if i have commands separated from queries if i have functions that do things separated from functions that return values i know exactly when I look a code base anything that returns a value will never have a side effect of changing the state of an object and this same principles is now being applied at a architectural I want to architectural level but at a higher level in separating reads from rights which is essentially what it is so seek urs is applying the same principle in data access right i have from one side I separate my I have my rights which are commands that update my database and from the other side I have queries that read from the database which then leads to why do I need an ORM then no longer necessary but it’s the same principle separating these two to make it clearer and more understandable of exactly the impact that I’m going to have and also obviously it also abides with the applying single responsibility two methods which is just do one single thing okay when you’re introducing a new feature do tdd it or don’t you TDD it now how many of you use TDD here one guy so the test driven development if you’re not aware i mean you all know what test-driven development is yes how many of you know it everyone what is it can someone tell me apart from you okay what else who else who agrees with that okay so do you consider the test what you’re

writing defining the features right so TDD is more about defining specification more about defining what you actually the functionality it’s about using the functionality before writing the functionality that’s in essence what it is right that’s what I’m doing i’m setting up a harness and isolated harness in which i can simulate using something before actually writing it why because how many times have you written a method and then you’ve come to use it and you realize all the way i’ve called it all the way the parameters are passed in etc aren’t appropriate i should rewrite it has that happened to you ever yeah so that’s what TDD is about kind of it’s about saying here use it and kind of see if that’s the way you want it and then write the functionality you so you define your shape you define the requirements of the of the system of the modules etc so obviously the question arises is I’ve got a code base that hardly has any unit test or has some unit test do i use TDD to drive this or not well I always look at EDD like an insurance policy right because forget the fact that you’re doing kind of you know design or whatever I often find that the teams that or even myself that when I say well i’ll tell you what i’ll just write some code and then i’ll write some unit tests afterwards those units has hardly ever come hardly ever come so it’s like you know i have no insurance when I now go and refactor I have no insurance whereas with TDD I first write the test which means that now I have some kind of guarantee that as I refactor I have a test in place and I don’t go back to it later and never write that test which is a problem so if you can kind of push it out with TDD it’s better also TDD leads you down better design paths because when you’re writing a test and you want to test something and you try to set it up and you start to see that there’s so many dependencies is so much required to set it up that’s where you start to think about how should I actually write this code so that it doesn’t require so much effort from my part to set it up so it does lead to a little bit better design there’s a whole bunch of other things I mean I can’t cover everything here there is a law of liskov which I recommend that you look at the law of Demeter which is when you have these a should talk to be and then be cause a method of C&S equals method of D you know only talk to the privates of your don’t touch the private parts of your friends something like that and the principle of least surprised and this is a this is a one that I often find that breaks there’s an example of this in asp.net MVC na in the HTML helpers which is the model valid model validation so there’s there’s a method whereby I pass in a class right I pass in a class which has a bunch of properties and I call validate and what this method does is basically print out when it renders an HTML page it prints out the input for the field of that class it prints out the span and it prints out some CSS for write a title now it prints out the span some validation code and an input box so I pass in a class and for each property of that class it prints out three elements okay why because what it does it’s a helper to do validation so that if I submit some information hit submit it says no these fields are required if you pass in and that takes an object which means it can take any class or it can take up just a single property I pass in a name and I expect what what would you expect if you pass in name that it does the same thing but only for name right that it generates an input field a span tag and some validation code for just the single property know if you pass a name it only does the input field and that is breaking the principle of least surprise which is don’t change the behavior of a method based on the parameters because there’s a developer when I look at something and i have now if you pass in an object it does this if you do that it does that if you do that it does that it completely breaks the principal early surprise which makes it hard for me to determine what exactly is going on welcome to the world of javascript javascript developers love to do that they’re just like yeah here’s a function no parameters you just dump in parameters and based on the parameters you pass and I’ll figure it out and do whatever I want okay that’s fine it’s wonderful that it doesn’t matter how I passing parameters you figure it out but how the hell am I meant to know what I’m in to pass in how do I know how the function behaves based on what I pass in we go to great lengths and efforts to constantly say that you know we shouldn’t write

documentation we should uncomment codes code should be self documenting and then we’re go and write these obscure methods that you don’t know what it’s doing so don’t break that recommended reading this is a I have not touched a lot of the topics that Michael feathers touches on working effectively with legacy code this gives you concrete examples of refactoring but it more or less is in line with what we’ve talked about but well with concrete examples so you know we always are going to end up in this situation always even when we start a greenfield project we’re always going to end up in in a somewhere like this right the only purpose here is to make it look a little bit cleaner more tidy and I asked how to prevent from happening again how do we prevent it because now we’ll go with the best intentions and then we will do it again and again and again I found this quote from this blog post which I really liked and it says when adding a feature that gold the gold isn’t getting from A to B in the simplest way the goal is to make the code that bead look as simple as possible you’re not done when a feature when all your tests pass you’re done when all your tests pass and the code looks as if it wasn’t designed from the beginning with your new feature in mind and here’s the key right you’re not done with a feature when all your tests pass you’re done when all your tests pass and and the important part the code looks as if it was designed from the beginning with your new feature in mind and that’s an important thing to think about if every time I design a new feature I make my system look as if I had thought about this from the beginning it means that I have very few hacks and things added to my code so it’s good to think about when you’re adding new features or refractory and with that I am done and I think we have a 43 seconds for questions nope ok well then thank you