|High Level Diagram|
Data flow diagrams (DFDs) can be really useful in helping you specify your security requirements when starting on a new web development project. Although there is a whole literature around the use of DFDs, you don't necessarily have to do anything fancy or complicated. Use simple boxes to show where data will be stored and lines to indicate where data flows.
Before embarking on the DFDs, classify the type of data that you will be processing. Is it personal data, payment card data etc.?
Draw the DFD at a fairly high level. It should contain boxes to indicate where data will be stored, and lines between the boxes to indicate the data flows. Include any other important components such as end-users or other devices, and mark the trust boundaries that the data flows cross.
Simple Use Cases or stories will help you work out where data will be stored and where it flows.
|High Level Diagram - with simple DFD|
For the purposes of this exercise I am going to concentrate on a checklist or question based approach.
Data Stores
Your DFD will show where data is stored. For each of these locations ask the following types of questions. The answers will depend to an extent on the Data Classification of the information.
- Do we need to store the data here? Ideally, data should be stored in as few places as possible. So, for example, if your DFD indicates that data is going to be stored on an external USB stick or a laptop, this should raise a red flag.
- Do we need (all) this data? In some cases you may be storing excessive amounts of data. Data Protection Legislation requires that you only store as much information as you need.
- Do we need to take special measures to protect this data? If you are processing data which falls under Data Protection Legislation then you should encrypt any data which is stored on portable media, such as USB sticks or laptops. For payment card information, you need to protect the data as required by the PCI DSS.
- Who should have access to the data?
- How do you control access to the data?
- How do you monitor access to the data?
- How long does the data need to be retained for?
- What happens to devices at the end of their life cycle? Devices which store sensitive or confidential information should be securely wiped or destroyed at the end of their life. Otherwise they might end up being advertised on eBay with the data still on them.
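The data store questions above can also be recorded in a small script so that obvious red flags show up each time the DFD changes. The following is only a minimal sketch: the `DataStore` fields, the classification labels and the two rules in `review_store` are illustrative assumptions, not a complete checklist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataStore:
    """One box on the DFD (all field names are hypothetical)."""
    name: str
    classification: str                    # e.g. "personal", "payment_card", "public"
    portable: bool = False                 # USB stick, laptop, etc.
    encrypted: bool = False
    retention_days: Optional[int] = None   # None means no retention period decided yet


# Assumption: these two classifications are the ones treated as sensitive.
SENSITIVE = {"personal", "payment_card"}


def review_store(store: DataStore) -> list:
    """Return a list of red flags for a single data store."""
    findings = []
    if store.classification in SENSITIVE:
        if store.portable and not store.encrypted:
            findings.append("sensitive data on unencrypted portable media")
        if store.retention_days is None:
            findings.append("no retention period defined")
    return findings


stores = [
    DataStore("customer_db", "personal", encrypted=True, retention_days=365),
    DataStore("sales_laptop", "personal", portable=True),
]

for store in stores:
    print(store.name, review_store(store))
```

Questions such as who should have access, or how access is monitored, do not reduce to a simple rule like this and still need a written answer.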
Data Flows
The next step is to look at data flows using similar types of questions. Of particular importance here is whether data flows across trust boundaries. You should have indicated these in your DFD.
These are the types of questions to ask about each data flow.
- Do you need to protect the confidentiality of the data? If data flows across a trust boundary, for example over the internet, then you will need to protect it. Typically you will use TLS, the successor to the now-deprecated SSL.
- Do you need to protect the integrity of the data? Maybe we need to check digital signatures on the data to make sure that it has not been tampered with.
- Do you need to validate the data? If the data crosses a trust boundary then it should be validated. Implement data validation using techniques such as those described in the OWASP Top 10 guidance.
- Do we need all this data? If the data contains excessive information, then delete the excess.
- How was the source authenticated? If the information is crossing a trust boundary, then we need some mechanism to confirm the identity of the sender. Typically some sort of logon mechanism is used for authentication.
- Are there any other measures which are necessary? For example, to prevent message replay or message deletion.
Document your answers to the above questions. They should help you specify the security requirements that your application will need to meet. For example, now you should have a better idea of where you need to encrypt your data, perform validation, authenticate the user etc.
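One way to keep that documentation in step with the diagram is to describe each flow as data and derive a first draft of the requirements from it. The sketch below assumes a hypothetical `Flow` record and a single rule, namely that any flow whose source and target sit in different trust zones needs encryption, validation and sender authentication. Real projects will want more nuance than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One line on the DFD (field names and zone labels are hypothetical)."""
    source: str
    target: str
    source_zone: str   # trust zone of the sender, e.g. "internet"
    target_zone: str   # trust zone of the receiver, e.g. "internal"


def requirements_for(flow: Flow) -> list:
    """Draft security requirements for a flow, based only on trust zones."""
    if flow.source_zone == flow.target_zone:
        return []  # no trust boundary crossed; review manually if needed
    return [
        "encrypt in transit (TLS)",
        "validate incoming data",
        "authenticate the sender",
    ]


flows = [
    Flow("browser", "web_app", "internet", "internal"),
    Flow("web_app", "app_db", "internal", "internal"),
]

for flow in flows:
    print(f"{flow.source} -> {flow.target}: {requirements_for(flow)}")
```

The output is a starting point for the written requirements, not a replacement for answering the questions for each flow.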
Some Gotchas
There are a number of areas where you need to be careful. Typically, these are the areas where data may be stored and you haven't thought about it.
Where is your backup stored? If it is stored externally, then you may need to think about mechanisms such as encryption.
Do you plan to use copies of your live production data in your test environment? Remember that even though it's being used in the test environment, it is still live data and should have the same level of protection applied to it.
Do you write data to application log files or other similar locations? You should make sure that any logging which captures sensitive data is disabled on production systems.
Do you write sensitive data to your audit trail? Make sure not to write sensitive information such as passwords, names and addresses to your audit trail.
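As a safety net for the two logging points above, a filter can scrub known sensitive fields before anything is written. This is a minimal sketch using Python's standard `logging` module. The `key=value` message format, the field names in `SENSITIVE_PATTERN` and the logger name are assumptions for illustration, and a pattern like this will not catch free-text items such as names and addresses, so it complements rather than replaces not logging them in the first place.

```python
import logging
import re

# Assumption: sensitive values appear in log messages as key=value pairs.
SENSITIVE_PATTERN = re.compile(r"(password|card_number)=\S+", re.IGNORECASE)


def redact(text: str) -> str:
    """Replace the value of each sensitive key with a fixed marker."""
    return SENSITIVE_PATTERN.sub(r"\1=[REDACTED]", text)


class RedactingFilter(logging.Filter):
    """Scrub sensitive values from every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then redact the result.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


# Attach the filter to the handler so it applies to everything the handler emits.
handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())

logger = logging.getLogger("app")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("login user=%s password=%s", "bob", "hunter2")
# emits: login user=bob password=[REDACTED]
```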
What about troubleshooting? Situations will arise where you need to extract production data to assist in troubleshooting. This could mean enabling application logging, or copying files to other systems where they can be analysed. You should have procedures in place to make sure that this information is deleted when the troubleshooting has been completed. Also disable any logging which you may have turned on during the troubleshooting.