
Attachment hashing

The attachments are currently stored as attach_id-hash.elk. Is there a reason for that? My guess is security through obscurity? In my recent sites we follow the Laravel model of having a /public/ folder for accessible stuff, with everything else in the directory above. Our sites are all in git and we just use an Apache virtual directory to point at wherever the public folder of the repo is on the server. It's pretty sweet. Then only public is exposed, and we keep all kinds of development and testing resources in the non-accessible upper folder.

So, for example, attachments would be one directory above public and out of the web root. The script reads them out and dumps them to the browser. Then instead of all the silly hashing we just use base 36 numbering to maximize the address (name) space. Any security is implemented in the access control layer (e.g., who can read; for PMs, maybe recipient, sender, and admin). It looks like the images are already dumped via PHP, so this isn't a performance hit.

I'll have a look at how feasible this is as I get our test migration into git and deployed on a live server.
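
The layout described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the storage directory, and the PM access rule are hypothetical and are not taken from ElkArte's code.

```php
<?php
// Hypothetical sketch: none of these function names come from ElkArte.

/** Map a numeric attachment id to a file path in a directory outside the web root. */
function attachmentPath(string $storageDir, int $fileId): string
{
    // base_convert() works on strings; ids are assumed to be non-negative integers.
    return rtrim($storageDir, '/') . '/' . base_convert((string) $fileId, 10, 36);
}

/** Access rule from the post: a PM attachment is readable by recipient, sender, or an admin. */
function canReadPmAttachment(int $userId, bool $isAdmin, int $senderId, int $recipientId): bool
{
    return $isAdmin || $userId === $senderId || $userId === $recipientId;
}

/** Stream the file to the browser; call this only after the access check has passed. */
function sendAttachment(string $path, string $downloadName): void
{
    if (!is_file($path)) {
        http_response_code(404);
        return;
    }

    // Strip characters that could break the quoted filename in the header.
    $safeName = str_replace(['"', '\\', "\r", "\n"], '', basename($downloadName));

    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($path));
    header('Content-Disposition: attachment; filename="' . $safeName . '"');
    readfile($path);
}
```

Because the file never sits under the public folder, the only way to reach it is through the script, so the access check is the single gate.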

Re: Attachment hashing

Reply #1
It's just security, same as the .elk type, which is AFAIK not used by any executable. So someone uploads a file and it gets hashed to some name they can't find, or should not be able to find. Then the extension is also to prevent it from being seen as an executable or any file used by an installed program. I think that is all the reasons, @emanuele? There are some less-than-secure setups around, so some things were done just to protect people from themselves.
Squish squish. squish, squish, squish.
Find a bug,
Make a wish.
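
As a rough illustration of the idea in Reply #1 (an on-disk name the uploader cannot guess, plus an extension no program claims), a minimal sketch might look like this. It is not ElkArte's actual code: the function name, the hash length, and the `_` separator are assumptions.

```php
<?php
// Hypothetical helper, not ElkArte's implementation.
function hashedStorageName(int $attachId): string
{
    // random_bytes() is cryptographically secure; 20 bytes become 40 hex characters.
    $hash = bin2hex(random_bytes(20));

    // Assumption: "_" as separator. The ".elk" extension is the one named in the thread.
    return $attachId . '_' . $hash . '.elk';
}
```

Even when the attachments directory is publicly reachable, a visitor who knows the attachment id still has to guess the random part to request the file directly.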

Re: Attachment hashing

Reply #2
That's my suspicion too. I fully imagine it's inherited from SMF. It's very 2008 PHP. I have not upgraded to ElkArte 1.1.x, so maybe it has changed.

Re: Attachment hashing

Reply #3
Nope, 1.1 is still the same-ish ... 2.0 is where we have been discussing overhauling the attachments code, which has kind of defied refactoring and may just need to be redone. It's really true spaghetti.
Squish squish. squish, squish, squish.
Find a bug,
Make a wish.

Re: Attachment hashing

Reply #4
Quote from Reply #3: "Nope, 1.1 is still the same-ish ... 2.0 is where we have been discussing overhauling the attachments code, which has kind of defied refactoring and may just need to be redone. It's really true spaghetti."

Any chance attachment resizing could become core? Pretty please? ;D

Re: Attachment hashing

Reply #5
That is my intention :D
Squish squish. squish, squish, squish.
Find a bug,
Make a wish.


Re: Attachment hashing

Reply #7
Quote from Reply #1: "Then the extension is also to prevent it from being seen as an executable or any file used by an installed program. I think that is all the reasons @emanuele ?"

IIRC the extension was also added to avoid problems with FileZilla (as explained on the SMF forums). Yep, I was also affected by this problem years ago and I lost some attachments and avatars. :(
sorry for my bad english

Re: Attachment hashing

Reply #8
Yep, pretty much what has been said.
I would say that the hashing was also to reduce the cases of naming conflicts? Dunno.

I think your case is an exception with respect to security measures. In most cases, the attachments directory is just in plain sight.
It's the same with the cache directory.
So I don't see this changing any time soon... What do you think @Spuds ?
Bugs creator.
Features destroyer.
Template killer.

Re: Attachment hashing

Reply #9
If security isn't a concern, I would definitely suggest that naming the files base_convert($file_id, 10, 36).'.elk' gives a lot more address space, eliminates any chance of naming conflicts, makes debugging a bit easier, and only the file_id is required to locate the .elk file in the attachments folder which might help untangle things a bit. It just looked ugly to me when I was doing the conversion :)
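
A small sketch of the naming proposed in this reply, wrapping the same `base_convert()` expression in two helpers. The function names are hypothetical; only the expression itself comes from the post.

```php
<?php
// Hypothetical helpers around the expression quoted in the post.

/** Build the on-disk name from the numeric id alone. */
function base36Name(int $fileId): string
{
    // base_convert() takes and returns strings; very large numbers can lose precision.
    return base_convert((string) $fileId, 10, 36) . '.elk';
}

/** Recover the numeric id from a stored file name. */
function fileIdFromName(string $name): int
{
    return (int) base_convert(basename($name, '.elk'), 36, 10);
}

echo base36Name(1000000), "\n"; // lfls.elk
```

Since the name is derived purely from the id, locating a file needs no extra lookup, and the id can be read back from the name while debugging.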

 

Re: Attachment hashing

Reply #10
heh.
Unfortunately, security from our side is always a concern.
In terms of complexity, the attachments code is already (as Spuds said) quite a bunch of spaghetti rolled up randomly. If we also add some kind of "custom naming" on top of it, depending on the security concerns of the forum owner... well, I already regret giving 3 options to organize folders. :-\
So we have to account for people that keep their attachments and caches open to the public.

That said, a base_convert doesn't really add anything to the address space, unless I'm misunderstanding: if $file_id in your example is the attach_id from the database, then whether it is the number from the db or its base 36 version, the potential for conflicts remains the same (that is... almost 0, because attach_id is an autoincrement, so I would not expect the same number to be returned twice... at least in MySQL; in PostgreSQL, since we use a function, I would expect that in some remote edge cases it may actually happen, but who knows, I'm not particularly knowledgeable about databases).
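
The point about the address space can be checked with a quick sketch: converting to base 36 is one-to-one, so the number of distinct names equals the number of distinct ids, and only the length of the name changes. The function name and the range of 5000 ids are arbitrary choices for illustration.

```php
<?php
// Hypothetical check: count how many distinct base 36 names the ids 1..$maxId produce.
function countDistinctBase36Names(int $maxId): int
{
    $names = array_map(
        fn (int $id): string => base_convert((string) $id, 10, 36),
        range(1, $maxId)
    );

    return count(array_unique($names));
}

// Same number of names as ids: no conflicts are added or removed by the conversion.
$distinct = countDistinctBase36Names(5000);

// The only difference is length: "5000" has 4 characters, its base 36 form has 3.
$decimalLength = strlen('5000');
$base36Length = strlen(base_convert('5000', 10, 36));
```

So base 36 gives shorter names for the same ids, but uniqueness still depends entirely on attach_id being unique in the database.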

In terms of "debugging", it boils down to what and how you are debugging. If it's just to know whether an attachment was created, you can check if a file with a certain attach_id is present. If it is, it has to match the one in the database (where, incidentally, the hash is stored (file_hash), so if you need it, it's just another round of fetching the info from the attachments table).
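
The debugging check described here could be sketched like this, assuming a row already fetched from the attachments table with its attach_id and file_hash columns. The function names and the `_` separator are assumptions, not ElkArte's API; the separator is a parameter so it can be adjusted to the real layout.

```php
<?php
// Hypothetical helpers; $row is assumed to hold 'attach_id' and 'file_hash'
// as fetched from the attachments table.

/** Build the file name expected on disk for a given attachments row. */
function expectedAttachmentFile(string $dir, array $row, string $separator = '_'): string
{
    return rtrim($dir, '/') . '/' . $row['attach_id'] . $separator . $row['file_hash'] . '.elk';
}

/** Report whether the file that should match the database row is actually present. */
function attachmentFileExists(string $dir, array $row, string $separator = '_'): bool
{
    return is_file(expectedAttachmentFile($dir, $row, $separator));
}
```

A missing file for an existing row, or a file whose hash part differs from file_hash, points to the kind of mismatch this paragraph is talking about.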

Just challenging, of course. ;)
Bugs creator.
Features destroyer.
Template killer.