How To Parse Gmail Emails
November 2022 · ian
Gmail is such a common email service that it is no surprise many people want to know how to parse Gmail emails. If that is what brought you here, you are in the right place. We are going to show you how to parse your Gmail emails into a format that can be used with other platforms. Let’s jump into it.
How to parse Gmail emails
The first thing to note is that Gmail is just a mailbox: it collects every email that is sent to your email address. For example, if my email is firstname.lastname@example.org, then any email sent to that address will show up in my Gmail inbox. If we want to take one of those emails and parse it, there are a couple of paths we can take.
The first is to ask the sender of that email to also send it to an additional email address that we’ll set up here at HostedHooks. This sends two copies of the email: one to your Gmail inbox and another to your custom HostedHooks inbox. For HostedHooks to parse the Gmail email, a copy of it has to reach our system, and adding an additional custom email address is one way of doing that.
If you can’t ask the sender to also send the email to a custom HostedHooks email address, you can use forwarding instead. This is a Gmail feature that watches for specific types of emails (based on a filter you set up) and forwards only those emails to another email address, which in our case would be your custom HostedHooks email address. This process is covered extensively in Gmail’s own documentation on forwarding and on creating filters.
Setting up your custom email parsing inbox
Regardless of which option you choose, you will need a custom HostedHooks email address, either to give to the original email sender or to use as the Gmail forwarding address. Let’s walk through how to set that up. You will first need a free HostedHooks account; if you don’t have one, you can create one here. Follow the instructions and, once ready, head over to your HookHelpers.
Click the “Create a new hook helper” button, which will generate a custom HookHelper for you. This HookHelper will act as your inbox for the emails that you want to parse. In the middle of the new HookHelper’s page you will see a custom email address.
This email address is yours and will accept any email that you send it.
Now that your email address is set up, we use it to get a second copy of the Gmail email that you want to parse into our system. If you have access to the sender, simply ask the sender to also send the email in question to firstname.lastname@example.org. Once that is set up, any new email that comes through will also show up in your HookHelper inbox. It will look something like this.
If you don’t have access to the email sender, you will need to use the Gmail filters and forwarding process that we mentioned above. The forwarding address would be the custom email address from your HookHelper inbox (email@example.com). Once that is set up, whenever the same type of email is received, it will be forwarded (based on your filter rules) to the HookHelper inbox.
Setting up parsing rules for Gmail emails
Now that your emails are being sent to the HostedHooks platform, we will ingest and store every email for you. The next step in parsing Gmail emails is to set up parsing rules that turn your unstructured email data into structured data that can be used on other platforms.
Now that we have a request, we need to set up our parsing rules. Go ahead and click on “Parsing” in the nav bar.
This next page is where you build the structure of the data that you want to convert your emails into. For every parsed email you want some output of data to send to a third party, and we build that data structure through rulesets. The output of each ruleset is a key-value pair, where the key is the label of the ruleset and the value is the text extracted from your email by the parsing rules you set up.
So for example we want to create a payload that looks like the below.
Let’s walk through how to do that by showing how to set up the firstname key-value pair. Start by creating a new ruleset. The label of the ruleset becomes the key of the key-value pair that we end up creating, so in our case we want the label to be “firstname”. Next, keep the body selected as the content type, since the data that we are looking to extract is found in the body of the email. You can ignore the sample email for now. Hit save and let’s move on to building the parsing rules.
We start with the raw text at the top and ultimately want to end up with just the first name data point extracted. To accomplish this we first create a 'Capture all after the match' rule and give it our keyword, 'First Name', as the input. The parsing engine will find the first occurrence of 'First Name' and keep all text following that match, just like the rule says.
That gives you the following output.
Next we want a rule that captures all the data before the 'Last Name' value. We do this by creating a 'Capture all before the match' rule and using the keyword 'Last Name:' as the input. With this rule, the parsing engine will locate the first instance of 'Last Name:' and keep all the text before that match.
That gives you the following output.
In some cases, there will be extra whitespace or newline characters surrounding the final output. To handle this we have a 'Remove leading and trailing whitespace' rule, which removes all of that for you. You may want to add it as a precautionary rule, but that is up to you.
In our case the final output looks good and we are capturing the dynamic data that we set out to extract. Our ruleset is good to go for the 'First Name' value.
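To make the three rules concrete, here is a minimal Python sketch of what each one does to a piece of text. It is only an illustration under stated assumptions, not the HostedHooks parsing engine: the function names, the sample email body, and the behavior when a keyword is missing are all hypothetical, and the first keyword is written with its trailing colon ('First Name:') so that the colon is not left in the result.

```python
# Hypothetical sample email body; the real email will differ.
SAMPLE_BODY = "Hello,\n\nFirst Name: Jane\nLast Name: Doe\nEmail: jane@example.com\n"


def capture_after(text: str, match: str) -> str:
    """Keep everything after the first occurrence of `match`.

    Assumption: returns an empty string when `match` is not found.
    """
    _, found, rest = text.partition(match)
    return rest if found else ""


def capture_before(text: str, match: str) -> str:
    """Keep everything before the first occurrence of `match`.

    Assumption: returns the text unchanged when `match` is not found.
    """
    head, _, _ = text.partition(match)
    return head


def remove_whitespace(text: str) -> str:
    """Remove leading and trailing whitespace, including newlines."""
    return text.strip()


# Apply the rules in the same order as the ruleset described above.
step1 = capture_after(SAMPLE_BODY, "First Name:")
# step1 == " Jane\nLast Name: Doe\nEmail: jane@example.com\n"
step2 = capture_before(step1, "Last Name:")
# step2 == " Jane\n"
first_name = remove_whitespace(step2)
# first_name == "Jane"
print(first_name)
```

Each rule takes the output of the previous rule as its input, which is why the order of the rules in a ruleset matters.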
For the sake of simplicity, we aren’t going to run through every ruleset in this article, but essentially you would repeat the same process above for every data point that you are looking to extract. If you have any questions about how to parse some content, come join us in our Discord and we’ll be happy to help.
Once done building your ruleset you will see your final sample output on the rulesets index page (see below). The "current payload" is the structure that each new parsed email will get turned into. Obviously, the data will change as it gets dynamically added based on the contents of the email.
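As a rough illustration in code (not the HostedHooks screen itself), the relationship between rulesets and the payload can be sketched in Python as a mapping from each ruleset label to an ordered list of rules. Everything here is hypothetical: the helper names, the "lastname" ruleset, the 'Email:' keyword, and the sample email body are assumptions made for the example.

```python
import json
from typing import Callable, Dict, List

Rule = Callable[[str], str]


def after(match: str) -> Rule:
    """Rule: keep the text after the first occurrence of `match`."""
    return lambda text: text.partition(match)[2]


def before(match: str) -> Rule:
    """Rule: keep the text before the first occurrence of `match`."""
    return lambda text: text.partition(match)[0]


# One entry per ruleset: the label becomes the key in the payload.
RULESETS: Dict[str, List[Rule]] = {
    "firstname": [after("First Name:"), before("Last Name:"), str.strip],
    "lastname": [after("Last Name:"), before("Email:"), str.strip],
}


def build_payload(body: str, rulesets: Dict[str, List[Rule]]) -> Dict[str, str]:
    """Run every ruleset against the email body and collect key-value pairs."""
    payload = {}
    for label, rules in rulesets.items():
        value = body
        for rule in rules:  # each rule consumes the previous rule's output
            value = rule(value)
        payload[label] = value
    return payload


# Hypothetical email body used only for this example.
body = "First Name: Jane\nLast Name: Doe\nEmail: jane@example.com\n"
payload = build_payload(body, RULESETS)
print(json.dumps(payload))
# {"firstname": "Jane", "lastname": "Doe"}
```

The keys stay fixed from one email to the next, while the values change with the contents of each incoming email.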
Now that our parsing is complete, we need to set up a webhook to send our parsed email data somewhere. We won’t cover that step in this article, but here are some resources to help you with that setup.
To recap what we’ve covered in this article:
- We set up a custom email inbox to receive and ingest Gmail emails
- We reviewed two strategies to get a copy of your Gmail emails sent to HostedHooks
- We reviewed how to set up parsing rules for your Gmail emails
With the above steps you should now be able to parse your Gmail emails and turn them into a format that can be used by other platforms.
We hope this article was helpful. Let us know if you have any questions!