Requesting help for batch import from a mailbox based system

We are a non-profit organization planning to introduce Elkarte for our community. Until now we have been using a mailbox-based listserv system. Put simply, the system creates one mailbox file per month and lets users post and reply via email. The task now is to import the existing mailbox content into a particular category of our Elkarte installation.

The mailboxes can be exported in a text-based format that follows this scheme:

Code:
=========================================================================
Date: <timestamp>
Reply-To: <reply address>
Sender: <sender address>
From: <User given/last name>
Subject: <Subject>

... message body

=========================================================================
Date: <timestamp>
Reply-To: <reply address>
Sender: <sender address>
From: <User given/last name>
Subject: <Subject>

... message body

=========================================================================
Date: <timestamp>
Reply-To: <reply address>
Sender: <sender address>
From: <User given/last name>
Subject: <Subject>

... message body

The same information could also be extracted as single files, i.e. one file per thread. How could I batch import this information into the Elkarte system? Is there any kind of API or service interface that can be called? I have studied OpenImporter for a while, but I am not sure whether its concept could be extended to do this.

Any pointer in the right direction is highly appreciated! As a side note, I could provide the registered users of the old system as a separate file.

Re: Requesting help for batch import from a mailbox based system

Reply #1

One of the "issues" with listserv / email style groups is that normally, as long as you had a valid email address, you could post, so users often ended up with multiple "IDs". That is not so bad in itself, but it does complicate an import, as you will end up with extra IDs and lost user history.

The second issue with listserv / email groups, and again this depends on what was used on the backend, is that it can be difficult to determine the correct threading of posts. Sometimes they will have a header key that allows you to see how things were threaded; other times you just have to go by the message subject and join on that.

If your user list correlates well with the file's "From: <User given/last name>" field, meaning you can find the users easily, then I would start by creating all of the IDs on the system. I'd then write a script that goes through your file message by message and uses some of ElkArte's functions to parse and post each message.

There is no API so to speak, but action_pbe_post in Emailpost_Controller is a place to start. You would feed that function an individual raw email that you extract from your list file, and it would do the rest and make a post. You would have to decide which board to post to and do some work on how to thread things, but that function should at least be a start.

That function is really designed to work specifically with mails from an ElkArte system, which have a topic key built in; that's how it knows how to thread. You would have to hot-wire it to get what you want, but it's doable, I believe.
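To illustrate the "join on the subject" fallback, here is a minimal sketch in Python. It is not part of ElkArte or Listserv; the function names and the list of reply prefixes ("Re:", "AW:", "Fwd:") are assumptions made for the example, and real archives may need more prefixes or list tags stripped.

```python
import re

# Assumed reply/forward prefixes; extend this for your own archive.
REPLY_PREFIX = re.compile(r"^\s*((re|aw|fwd?)\s*:\s*)+", re.IGNORECASE)


def thread_key(subject):
    """Reduce a subject line to a key shared by a topic and its replies."""
    stripped = REPLY_PREFIX.sub("", subject)
    # Collapse runs of whitespace and ignore case differences.
    return " ".join(stripped.split()).casefold()


def group_by_subject(messages):
    """Group messages (dicts with a "Subject" entry) into threads.

    Messages keep their input order inside each thread, so the first
    entry of a thread is the earliest one seen in the file.
    """
    threads = {}
    for msg in messages:
        threads.setdefault(thread_key(msg["Subject"]), []).append(msg)
    return threads
```

Two unrelated discussions that happen to share a subject would be merged by this approach, so a manual check of the larger threads is probably worthwhile.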
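Checking how well the user list correlates with the "From:" values can be scripted before any accounts are created. The sketch below is only an illustration in Python: the shape of the user list (dicts with "id" and "name") is an assumption, since the format of the separate user file is not given in this thread.

```python
def normalize_name(name):
    """Compare names without regard to case or extra whitespace."""
    return " ".join(name.split()).casefold()


def match_authors(from_values, users):
    """Map raw "From:" values to user ids.

    `users` is assumed to be a list of dicts with "id" and "name".
    Returns (matched, unmatched): a dict of raw value -> user id, and
    the set of raw values that need manual attention.
    """
    by_name = {normalize_name(u["name"]): u["id"] for u in users}
    matched, unmatched = {}, set()
    for raw in from_values:
        user_id = by_name.get(normalize_name(raw))
        if user_id is None:
            unmatched.add(raw)
        else:
            matched[raw] = user_id
    return matched, unmatched
```

A large unmatched set would be an early sign of the "multiple IDs" problem described above.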

Re: Requesting help for batch import from a mailbox based system

Reply #2

Quote from: Spuds (Reply #1)

Thanks for your explanation. I guess it's a good idea to use the "email entry door" to the system. That sounds promising to me, and I will certainly give it a try.

I believe the hard-wired boards and topics will not be a problem. These messages will be imported more or less for historical reasons. They will have a dedicated space in the forum and will not be actively used for future discussions. The old system will be switched off and we want to keep a record of the past, so to speak.

As a side note, the software used is "Listserv" from a company called L-Soft.

[SOLVED] Requesting help for batch import from a mailbox based system

Reply #3

I want to report back on the issue ... good news first: it has been resolved!

Here is what I did:
  • Adjusted source code related to mail management, as stated here.
  • Created a new PHP file in the root folder, following the general pattern of "emailtopic.php".
  • Wired that new file to an email address, so that received email is piped to it.

The code of this new module does not use the content of the trigger email. Instead, it reads the content from a file structured as described earlier. Obviously, I first had to place my mail digest file(s) on the server. The code splits the digest file message by message and uses the system functions to post each topic. I also added code to tweak the content a little, e.g. to add a header and footer to each message. In addition, it overrides the posting date, so that the migrated messages can be sorted chronologically by the timestamp of the original posting.

As mentioned, the board the messages are posted to is "hard-wired". I also created a virtual user "importer" that serves as the author of every migrated topic.

That did the job for me. I just wanted to describe the approach; maybe others will find food for thought in it. Happy to discuss details!
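The splitting and date handling can be sketched roughly as follows. This is not the PHP module described here, just an illustration in Python under some assumptions: separators are lines of at least 20 "=" characters, headers are single-line "Name: value" pairs, the Date value is in the usual email (RFC 2822) style, and the header and footer wording is made up for the example. The actual posting through the forum's functions is left out.

```python
import re
from email.utils import parsedate_to_datetime

# A separator is a line made only of "=" characters (assumed: 20 or more).
SEPARATOR = re.compile(r"^={20,}[ \t]*$", re.MULTILINE)


def split_digest(text):
    """Split a digest export into a list of (headers, body) pairs."""
    text = text.replace("\r\n", "\n")  # tolerate Windows line endings
    messages = []
    for chunk in SEPARATOR.split(text):
        chunk = chunk.strip("\n")
        if not chunk.strip():
            continue  # skip the empty piece before the first separator
        # Headers end at the first blank line; the rest is the body.
        head, _, body = chunk.partition("\n\n")
        headers = {}
        for line in head.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                headers[name.strip()] = value.strip()
        messages.append((headers, body.strip()))
    return messages


def posted_at(headers):
    """Original posting time as a Unix timestamp, or None if unparseable."""
    try:
        return int(parsedate_to_datetime(headers.get("Date", "")).timestamp())
    except (TypeError, ValueError):
        return None


def decorate(headers, body):
    """Wrap the body in a header and footer (hypothetical wording)."""
    header = "Originally posted by {} on {}".format(
        headers.get("From", "unknown"), headers.get("Date", "unknown date")
    )
    footer = "Imported from the mailing list archive"
    return "{}\n\n{}\n\n{}".format(header, body, footer)


def prepare_import(text):
    """Return one row per message, oldest first; undated rows go last."""
    rows = []
    for headers, body in split_digest(text):
        rows.append({
            "subject": headers.get("Subject", "(no subject)"),
            "timestamp": posted_at(headers),
            "body": decorate(headers, body),
        })
    rows.sort(key=lambda r: (r["timestamp"] is None, r["timestamp"] or 0))
    return rows
```

A message body that itself contains a long line of "=" characters would be split in the wrong place, so the number of resulting messages should be compared against the archive before posting anything.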