Transforms - Enable dedup by removing date/time from event summary

It seems Windows developers love to put time/date fields into the most SPAMMY events (IIS web app errors, BizTalk, etc.)...

With the default de-duplication algorithm, the changing timestamp makes every summary look unique, so you will sometimes get thousands of individual messages in your database. You may of course want to do something more sophisticated: strip out the variable information, construct an appropriate dedupid, remove fields via variables, etc.
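For the "construct an appropriate dedupid" route, here is a rough sketch of the idea, not a definitive implementation. It assumes the event exposes the usual device, component, eventClass, eventKey and severity fields, and that a pipe-delimited fingerprint is acceptable. The helper name build_dedupid and the sample values are made up for illustration, and SimpleNamespace only stands in for the evt object a transform receives.

```python
from types import SimpleNamespace


def build_dedupid(evt):
    """Join only the stable fields, so a changing summary cannot split the event.

    Hypothetical helper: the field list is an assumption, adjust it to your events.
    """
    parts = [evt.device, evt.component, evt.eventClass, evt.eventKey, str(evt.severity)]
    return "|".join(parts)


# Stand-in for the evt object available inside a transform (sample values only).
evt = SimpleNamespace(
    device="web01",
    component="IIS",
    eventClass="/App/Failed",
    eventKey="",
    severity=4,
    summary="Request failed at 2009-05-22 08:16:00",
)

# The summary (and its timestamp) is deliberately left out of the fingerprint.
evt.dedupid = build_dedupid(evt)
print(evt.dedupid)  # web01|IIS|/App/Failed||4
```

Inside a real transform you would only need the assignment to evt.dedupid. The trade-off is that events with genuinely different summaries but the same fields will also collapse into one.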

If you want to just quickly remove date fields from the event summary to enable an event to deduplicate, you can use a variant of this transform.

Personally, I'd rather not spend hours writing complex regexes... Here is a library I use when I have to do something with dates/times:

# this transform will remove dates and times
# in the format yyyy-mm-dd hh:mm:ss
# it replaces the date field in the event summary with
# the text 'DATETIMEREMOVED'
# that way we don't need to do a custom deduplication id
# it matches: 00-00-00 00:00:00 | 0000-00-00 00:00:00 | 09-05-22 08:16:00 | 1970-00-00 00:00:00 | 20090522081600
# it does not match:  2009-13:01 00:00:00 | 2009-12-32 00:00:00 | 2002-12-31 24:00:00 | 2002-12-31 23:60:00 | 02-12-31 23:00:60
import re
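# raw string so the backslashes reach the regex engine untouched;
# the 4-digit year is tried first so 20090522081600 is consumed whole
evt.summary = re.sub(r"(\d{4}|\d{2})(?:\-)?([0]{1}\d{1}|[1]{1}[0-2]{1})(?:\-)?([0-2]{1}\d{1}|[3]{1}[0-1]{1})(?:\s)?([0-1]{1}\d{1}|[2]{1}[0-3]{1})(?::)?([0-5]{1}\d{1})(?::)?([0-5]{1}\d{1})", 'DATETIMEREMOVED', evt.summary)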
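
If you want to try the pattern outside of Zenoss before pasting it into a transform, something like the following sketch should do. It uses the transform's pattern in a slightly condensed form, written with the four-digit year alternative first so that compact stamps are replaced whole. The function name strip_datetime and the sample summaries are invented for the test.

```python
import re

# Condensed form of the transform's pattern (assumed equivalent):
# year, month 00-12, day 00-31, hour 00-23, minute and second 00-59,
# with the '-', whitespace and ':' separators all optional.
DATETIME_RE = re.compile(
    r"(\d{4}|\d{2})(?:-)?(0\d|1[0-2])(?:-)?([0-2]\d|3[01])"
    r"(?:\s)?([01]\d|2[0-3])(?::)?([0-5]\d)(?::)?([0-5]\d)"
)


def strip_datetime(summary):
    """Replace every date/time stamp in summary with a fixed placeholder."""
    return DATETIME_RE.sub("DATETIMEREMOVED", summary)


# Two summaries that differ only in their timestamp (made-up samples).
first = strip_datetime("IIS error at 2009-05-22 08:16:00 in app")
second = strip_datetime("IIS error at 2009-05-22 08:17:42 in app")

print(first)
print(first == second)  # identical summaries can now deduplicate
```

Keep in mind that re.sub searches anywhere in the string, so any long enough run of digits that happens to look like a date will be replaced too.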