Monday 25 October 2010

Data Flow Diagrams and Security Requirements

High Level Diagram
Updated: June 1st 2012

Data Flow Diagrams (DFDs) can be really useful in helping you specify your security requirements when starting on a new web development project. Although there is a whole literature around the use of DFDs, you don't necessarily have to do anything fancy or complicated. Use simple boxes to show where data will be stored and lines to indicate where data flows.

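Before embarking on the DFDs, classify the type of data that you will be processing. Is it personal data, payment card data, or something else? The classification matters because it drives many of the answers to the questions later on.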
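To make the classification step concrete, here is a minimal sketch in Python. The class names, their ordering and the example data items are my own assumptions for illustration; your organisation's classification scheme may well look different.

```python
from enum import IntEnum


class DataClass(IntEnum):
    """Hypothetical classification levels; a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    PERSONAL = 2
    PAYMENT_CARD = 3


# Hypothetical inventory of the data items the application will process.
DATA_ITEMS = {
    "product_description": DataClass.PUBLIC,
    "customer_email": DataClass.PERSONAL,
    "card_number": DataClass.PAYMENT_CARD,
}


def highest_classification(fields):
    """Return the most sensitive classification among the given data items."""
    return max(DATA_ITEMS[field] for field in fields)
```

A store or flow that carries several data items can then be treated according to the most sensitive item it contains.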

Draw the DFD at a fairly high level. It should contain boxes to indicate where data will be stored, and lines between the boxes to indicate the data flows. Include any other important components such as end-users or other devices.

Simple Use Cases or stories will help you work out where data will be stored and where it flows.

An important concept is that of Trust Boundaries. A trust boundary is where data comes from, or goes to, somewhere that you don't inherently trust. The internet is one obvious place, but a trust boundary can also exist within your own organisation. For example, data may come from a different department, and you don't necessarily trust the data that they transmit.
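If you prefer to keep the diagram in a form you can query, the boxes, lines and trust boundaries can be sketched as simple data structures. This is only an illustration: the names Node, Flow and zone, and the example components, are assumptions of mine and not part of any DFD standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A box on the diagram: a data store, an end-user or another component."""
    name: str
    zone: str  # hypothetical trust zone label, e.g. "internet" or "internal"
    stores_data: bool = False


@dataclass(frozen=True)
class Flow:
    """A line on the diagram: data moving from one node to another."""
    source: Node
    target: Node
    data: str


def crosses_trust_boundary(flow):
    """A flow crosses a trust boundary when its two ends sit in different zones."""
    return flow.source.zone != flow.target.zone


# Hypothetical example system.
browser = Node("End-user browser", zone="internet")
web = Node("Web server", zone="internal")
db = Node("Customer database", zone="internal", stores_data=True)

flows = [
    Flow(browser, web, "registration form"),
    Flow(web, db, "customer record"),
]

boundary_flows = [flow for flow in flows if crosses_trust_boundary(flow)]
```

In this example only the flow from the browser to the web server crosses a trust boundary, so that is the flow to look at most closely.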

High Level Diagram - with simple DFD
When you have drawn the diagram, begin to look at security-related issues. Threat Modelling is often used to help work out the security risks that you need to address. An issue with formal Threat Modelling is that it can be a relatively complicated technique, and it is easy to lose sight of what you are trying to achieve. However, if you or your organisation are familiar and comfortable with the concept of Threat Modelling, then use it.

For the purposes of this exercise I am going to concentrate on a checklist or question-based approach.

Data Stores

Your DFD will show where data is stored. For each of these locations ask the following types of questions. The answers will depend to an extent on the Data Classification of the information.
  • Do we need to store the data here? Ideally, data should be stored in as few places as possible. So, for example, if your DFD indicates that data is going to be stored on an external USB stick or a laptop, this should raise a red flag.
  • Do we need (all) this data? In some cases you may be storing excessive amounts of data. Data Protection Legislation requires that you only store as much information as you need.
  • Do we need to take special measures to protect this data? If you are processing data which falls under Data Protection Legislation then you would need to encrypt any data which is stored on portable media, such as USB sticks or laptops. For payment card information, you would need to protect data in the manner set down by the PCI DSS standards.
  • Who should have access to the data?
  • How do you control access to the data?
  • How do you monitor access to the data?
  • How long does the data need to be retained for?
  • What happens to devices at the end of their life cycle? Devices which store sensitive or confidential information should be securely destroyed at the end of their life. Otherwise they might end up being advertised on eBay.
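Some of the questions above can be turned into a rough automated check against your list of data stores. The following is a minimal sketch only. The DataStore fields and the classification strings are hypothetical, and the findings are prompts for discussion rather than a compliance test.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataStore:
    """Hypothetical description of one data store taken from the DFD."""
    name: str
    classification: str            # assumed labels: "public", "personal", "payment_card"
    portable: bool                 # USB stick, laptop or similar
    encrypted: bool
    retention_days: Optional[int]  # None means no retention period has been decided


def review_store(store: DataStore) -> List[str]:
    """Return the checklist questions that this store raises."""
    findings = []
    sensitive = store.classification in {"personal", "payment_card"}
    if store.portable:
        findings.append("stored on portable media: do we need to store the data here?")
    if sensitive and store.portable and not store.encrypted:
        findings.append("sensitive data on portable media should be encrypted")
    if store.classification == "payment_card":
        findings.append("check how the data is protected against PCI DSS")
    if store.retention_days is None:
        findings.append("no retention period has been defined")
    return findings
```

For example, review_store(DataStore("Sales laptop", "personal", True, False, None)) returns three findings, while a non-portable store of public data with a defined retention period returns none.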

Data Flows

The next step is to look at data flows using similar types of questions. Of particular importance here is whether data flows across trust boundaries. You should have indicated these in your DFD.

These are the types of questions to ask for each data flow.
  • Do you need to protect the confidentiality of the data? If data flows across a trust boundary, such as the internet, then you will need to protect it in transit. Typically you will use TLS, the successor to the now-deprecated SSL.
  • Do you need to protect the integrity of the data? You may need to check digital signatures on the data to make sure that it has not been tampered with.
  • Do you need to validate the data? Implement data validation using the techniques outlined in the OWASP Top 10. If data crosses a trust boundary then it should be validated.
  • Do we need all this data?   If the data contains excessive information, then delete the excess.
  • How was the source authenticated?   If the information is crossing a trust boundary, then we need some mechanism to confirm the identity of the sender. Typically some sort of logon mechanism is used for authentication.
  • Are there any other measures which are necessary?  For example, to prevent message replay or message deletion.
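The integrity and replay questions can be illustrated with a short sketch. Note that it swaps in an HMAC with a shared key rather than a true digital signature, simply because that can be shown with the Python standard library alone. The key, the envelope layout and the five minute window are all assumptions of mine, and the in-memory nonce set would need to be replaced by shared storage in a real system. It does not address message deletion.

```python
import hashlib
import hmac
import json
import secrets
import time

SHARED_KEY = b"example-key-change-me"  # hypothetical; load from a secret store in practice
MAX_AGE_SECONDS = 300                  # assumed acceptance window for a message
_seen_nonces = set()                   # in-memory only; enough for a sketch


def _mac(envelope: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 over a canonical JSON encoding of the envelope."""
    body = json.dumps(envelope, sort_keys=True).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def sign_message(payload: dict, key: bytes = SHARED_KEY) -> dict:
    """Wrap the payload with a nonce, a timestamp and a MAC."""
    envelope = {
        "payload": payload,
        "nonce": secrets.token_hex(16),
        "timestamp": int(time.time()),
    }
    message = dict(envelope)
    message["mac"] = _mac(envelope, key)
    return message


def verify_message(message: dict, key: bytes = SHARED_KEY) -> bool:
    """Reject tampered, stale or replayed messages."""
    received = str(message.get("mac", "")).encode("utf-8")
    envelope = {k: v for k, v in message.items() if k != "mac"}
    expected = _mac(envelope, key).encode("utf-8")
    if not hmac.compare_digest(received, expected):
        return False  # integrity check failed
    if abs(time.time() - envelope["timestamp"]) > MAX_AGE_SECONDS:
        return False  # too old to accept
    if envelope["nonce"] in _seen_nonces:
        return False  # already processed: treat as a replay
    _seen_nonces.add(envelope["nonce"])
    return True
```

A message produced by sign_message verifies once. Sending the same message again, or changing the payload after signing, causes verify_message to return False.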

Document your answers to the above questions. They should help you specify the security requirements that your application will need to meet. For example, now you should have a better idea of where you need to encrypt your data, perform validation, authenticate the user etc.
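As an illustration of the kind of validation requirement that might come out of this exercise, here is a minimal allow-list sketch for data arriving across a trust boundary. The field names and rules are hypothetical and chosen only for the example. Rejecting unexpected fields also ties in with the question of whether you need all the data.

```python
import re

# Hypothetical form fields and rules, chosen only for illustration.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,30}")
AGE_RE = re.compile(r"[0-9]{1,3}")
ALLOWED_FIELDS = {"username", "age"}


def validate_registration(form):
    """Return a dict of field name to error message; empty means the input is valid."""
    errors = {}
    # Reject anything that was not asked for, rather than silently storing it.
    for field in form:
        if field not in ALLOWED_FIELDS:
            errors[field] = "unexpected field"
    username = form.get("username")
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        errors["username"] = "must be 3-30 letters, digits or underscores"
    age = form.get("age")
    if not isinstance(age, str) or not AGE_RE.fullmatch(age) or int(age) > 150:
        errors["age"] = "must be a whole number between 0 and 150"
    return errors
```

The sketch accepts only what it recognises as good, rather than trying to list everything that might be bad.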

Some Gotchas

There are a number of areas where you need to be careful. Typically, these are the areas where data may be stored and you haven't thought about it.

Where is your backup stored? If it is stored externally, then you may need to think about mechanisms such as encryption.

Do you plan to use copies of your live production data in your test environment? Remember that even though it's being used in the test environment, it is still live data and should have the same level of protection applied to it.

Do you write data to application log files or other similar locations? You should make sure that this functionality is disabled on production systems.

Do you write sensitive data to your audit trail? Make sure not to write sensitive information such as passwords, names and addresses etc. to your audit trail.
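One way to guard against sensitive values slipping into logs or the audit trail is a redacting filter. The sketch below uses Python's standard logging module. The two patterns are my own assumptions and are deliberately simple. They will not catch names or addresses, so the safer approach is still not to log those values in the first place.

```python
import logging
import re

# Hypothetical patterns: a run of 13 to 19 digits (optionally separated by
# spaces or hyphens) and anything that follows "password=".
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
PASSWORD_RE = re.compile(r"(password=)\S+", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Mask card-like numbers and password values before a record is written."""

    def filter(self, record):
        message = record.getMessage()  # message with its arguments merged in
        message = CARD_RE.sub("[REDACTED CARD]", message)
        message = PASSWORD_RE.sub(r"\1[REDACTED]", message)
        record.msg = message
        record.args = ()  # arguments are already merged into the message
        return True       # keep the record, now redacted
```

Attach the filter to each handler that writes the audit trail, for example with handler.addFilter(RedactingFilter()).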

What about troubleshooting? Situations will arise where you need to extract production data to assist in troubleshooting. This could mean enabling application logging, or copying files to other systems where they can be analysed. You should have procedures in place to make sure that this information is deleted when the troubleshooting has been completed. Also disable any logging which you may have turned on during the troubleshooting.
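Part of such a procedure can be automated so that the clean-up is not forgotten. The following sketch is only one possible approach, and the function name is hypothetical. It turns on debug logging for a named logger, provides a temporary working directory for extracted files, and undoes both when the session ends. Note that removing the directory is an ordinary delete and not a secure wipe.

```python
import logging
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def troubleshooting_session(logger_name):
    """Enable debug logging and provide a scratch directory, then clean up both."""
    logger = logging.getLogger(logger_name)
    previous_level = logger.level
    logger.setLevel(logging.DEBUG)
    try:
        # The directory and everything in it is removed when the block exits.
        with tempfile.TemporaryDirectory(prefix="troubleshoot-") as workdir:
            yield logger, Path(workdir)
    finally:
        # Restore the original log level even if the investigation fails.
        logger.setLevel(previous_level)
```

Used as "with troubleshooting_session("payments") as (log, workdir):", any files written under workdir disappear and the log level returns to its previous value once the block is left.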


Data Flow Diagrams (DFDs) are a useful tool in helping you specify the security requirements that your application needs to meet. The approach shown here is relatively simple and informal. In particular, it does not use Threat Modelling techniques to analyse your risks. To look at Threat Modelling in more detail, visit the Microsoft Security Development Lifecycle (SDL) website, which covers the technique in more depth. The approach I have shown here is more suited to development teams where there is no security program in place, or only a limited one.
