Thursday, 20 June 2013

Visual Studio Tricks - Code Clone Detection

It seems that the tools I take for granted are not used by all. This is the first in a series of posts where I will share some of the essential Visual Studio tools that every .NET developer should be aware of. If you already know them, spare yourself my stating the obvious and go read something else.

Call me Code Review Girl and hand me a cape. In my role, it is common for me to travel the length and breadth of Asia, Australia and New Zealand to conduct code reviews and design divergence checks on very large government and commercial code bases.

Microsoft has internal tools that let us do that with great ease. This is always where we start. However, we often go further with remediation plans and sit with developers to show them the kinds of things we found in their code that need improving. We don't just show them how to fix these issues; we explain why they are issues and how they can be tracked down. In my case, I use pair programming to start. Teach a dev to fish and all that.

When I pair program with a developer, I will never let them copy and paste code. They can either type the whole thing from scratch or they can write a method and call it in both places. Copying and pasting code is an anti-pattern that results in bloated code bases and, often, errors carried from one copy to the next.
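To make the "write a method and call it in both places" option concrete, here is a minimal sketch. It is in Python rather than C# purely for brevity, and every name in it (`format_name`, `invoice_header`, `shipping_label`) is made up for illustration.

```python
def format_name(first, last):
    """The one shared copy of the logic that would otherwise be pasted twice."""
    return f"{last.strip().upper()}, {first.strip().title()}"


def invoice_header(first, last):
    # Calls the shared helper instead of carrying its own pasted copy.
    return "Invoice for " + format_name(first, last)


def shipping_label(first, last):
    # Same helper again, so a fix to the formatting lands in both places at once.
    return "Ship to " + format_name(first, last)
```

If the formatting rule ever changes, there is exactly one place to change it, which is the whole point.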

Now, that is all well and good when you are writing new code with another developer, but for the times when you aren't watching and enforcing that rigour, or for past crimes, Visual Studio 2012 gives you Code Clone Detection.

I won't go into too much detail repeating what the link above explains, but I will give you a quick summary of what you get and how to use it.

Code Clone Detection looks at code fragments of 10 or more statements that appear in methods and properties across entire projects or solutions. It still finds a match when up to 40% of the tokens have been changed or when statements have been rearranged.
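To get a feel for what "up to 40% of its tokens changed" means, here is a toy sketch in Python. It is emphatically not how Visual Studio does it: the tokenizer, the use of `difflib`, the function names and the shortcut of treating one non-blank line as one statement are all my own assumptions for illustration. Only the two thresholds come from the description above.

```python
import re
from difflib import SequenceMatcher

# Crude tokenizer (an assumption): identifiers, numbers, or any single symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(code):
    return TOKEN.findall(code)


def changed_fraction(a, b):
    """Rough share of tokens that differ between two fragments (0.0 = identical)."""
    matcher = SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False)
    return 1.0 - matcher.ratio()


def statement_count(code):
    # Approximation: treat every non-blank line as one statement.
    return sum(1 for line in code.splitlines() if line.strip())


def looks_like_clone(a, b, max_changed=0.40, min_statements=10):
    """True when both fragments are big enough and similar enough."""
    big_enough = min(statement_count(a), statement_count(b)) >= min_statements
    return big_enough and changed_fraction(a, b) <= max_changed
```

Renaming a variable throughout a ten-line fragment changes only a minority of its tokens, so under this sketch the renamed copy still counts as a clone, while a two-line fragment is ignored however similar it is.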

There are two ways to use it:
  1. Highlight a specific piece of code, right-click and choose Find Matching Clones in Solution to find where that specific code is repeated; or
  2. Use the Analyze menu to choose Analyze Solution for Code Clones and find all occurrences of repeated code.
You can exclude certain files, such as generated code, from the analysis, and you will want to do that with large generated code bases.
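As far as I recall from the documentation, exclusions are configured by adding an XML file with the `.codeclonesettings` extension to the project. Treat the following as a sketch to check against the docs for your Visual Studio version. The paths and names in it are placeholders, not anything from a real project.

```xml
<CodeCloneSettings>
  <Exclusions>
    <!-- Placeholder path: skip everything in a generated-code folder -->
    <File>GeneratedFiles\*.cs</File>
    <!-- Placeholder names: skip a whole namespace or a single type -->
    <Namespace>MyCompany.MyProject.Generated</Namespace>
    <Type>MyCompany.MyProject.MyGeneratedClass</Type>
  </Exclusions>
</CodeCloneSettings>
```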

I find this one of the most useful tools when looking at how to attack a large code base that needs a refactoring machete. This is a tech lead's best friend. Or at least one of them.

Let the clone wars begin.


Sherry said...

Nice and informative read for me. I have not used this before. I would rather do it manually by going through chunks and then refactoring them into a single method, helper or the like.

Damana Madden said...

This certainly does not discourage you from manual refactoring. It is a way of indicating where you should focus.

Sherry said...

True. In fact, manual refactoring has two advantages that I can mention right away.
1. You swim through the system every now and then, which helps you know every nook of it.
2. If you are doing it as a routine, the next time you code, you code better and more precisely.