LiveJournal Client Discussions

LJ Security script [Jun. 19th, 2004|09:26 pm]

lj_clients

[altamira16]
So there has been the old LJ security script that was going around a few years ago written by some guy named Kevin, and I liked what it did, but it looked like it hit the LJ Server more than Ike hit Tina.

I changed it to speed it up and to be kinder to the LJ Server, and I was wondering if someone could look over it and point out anything that may still need work. This is the first time I have done anything that does server/client stuff in Perl.

Here is my code.

Thank you.

Comments:
From: marksmith
2004-06-19 10:13 pm (UTC)

Never ever use getevents with a selecttype of day if you intend on uploading the data back to the server.

Look at how the jbackup.pl script downloads entries (it's in CVS, livejournal/src/jbackup/jbackup.pl), or go back in this community a few months and look for a post I wrote that details the proper way to download entries.

If you download entries day by day, you hit the server a bunch more than you need to, AND if the server has data that isn't properly encoded, instead of sending it to you raw, it will send you the entry--except with the subject and body replaced with "(cannot be displayed)", which your code will then upload to the user's journal and overwrite their entry.
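Whichever download mode a script uses, one defensive measure is to refuse to reupload anything whose subject or body is that placeholder. A minimal Python sketch of such a guard follows. The `subject` and `event` keys are assumed to match the protocol's field names, and `is_safe_to_reupload` is a made-up helper, not part of any LJ library:

```python
# The placeholder text quoted above; an entry carrying it must never be sent back.
PLACEHOLDER = "(cannot be displayed)"


def is_safe_to_reupload(entry):
    """Return False if the subject or body is the server's placeholder text."""
    for field in ("subject", "event"):  # assumed field names
        if entry.get(field, "").strip() == PLACEHOLDER:
            return False
    return True


# Made-up sample data for illustration only.
entries = [
    {"itemid": 1, "subject": "Hello", "event": "A normal post."},
    {"itemid": 2, "subject": PLACEHOLDER, "event": PLACEHOLDER},
]

safe = [e for e in entries if is_safe_to_reupload(e)]
skipped = [e["itemid"] for e in entries if not is_safe_to_reupload(e)]
```

This only catches the symptom. It does not make a day-by-day download any cheaper for the server.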

You should also read up on how Unicode works, how to set the proper version number in the protocol, and test your code against journals that have properly AND improperly encoded data.
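As a rough illustration of those two points, the Python sketch below builds request parameters with `ver` set to 1 (assumed here to be the value that marks a Unicode-aware client) and checks whether raw bytes are valid UTF-8. Both function names are hypothetical:

```python
def base_request(username):
    """Parameters shared by every call; ver=1 is assumed to mean 'client speaks UTF-8'."""
    return {"username": username, "ver": 1}


def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# One properly encoded sample and one improperly encoded sample for testing.
good = "caf\u00e9".encode("utf-8")
bad = "caf\u00e9".encode("latin-1")  # a lone 0xE9 byte is not valid UTF-8
```

Running a script against both kinds of sample data is a cheap way to see how it behaves before pointing it at a real journal.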

There are just so many things that can go wrong taking someone's journal and sending it back to the server for resubmission. Way too many things.
From: altamira16
2004-06-19 10:37 pm (UTC)

Ok, from your older post, it looks like you feel that syncitems is the superior way to do this, but what if someone wants to turn a particular period of posts private without changing their entire journal? I am not sure if it is beneficial to the user to download the whole journal if they are interested in a subset of the posts.

Do you have any better suggestions than downloading and resubmitting the data for changing the privacy level of a group of posts?
From: evan
2004-06-20 08:49 am (UTC)

download the entire journal, then reupload the subset you want to change.
i've been on lj for four years: a day-by-day download would take over a thousand separate requests, while a full syncitems download is only ten or so.
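the arithmetic behind those numbers, as a rough python sketch. the start date, the entry count, and the batch size are made-up figures for illustration; the real batch size depends on the server:

```python
import math
from datetime import date


def day_by_day_requests(first_day, last_day):
    """One getevents request per calendar day, inclusive of both ends."""
    return (last_day - first_day).days + 1


def batched_requests(total_items, items_per_request):
    """Requests needed when the server hands back items in batches."""
    return math.ceil(total_items / items_per_request)


# Hypothetical four-year span ending on the date of this comment.
day_requests = day_by_day_requests(date(2000, 6, 20), date(2004, 6, 20))

# Hypothetical figures: 1000 entries fetched 100 at a time.
sync_requests = batched_requests(1000, 100)
```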
From: altamira16
2004-06-20 12:36 pm (UTC)

Spiffy, thank you. :)
From: elo_sf
2004-06-20 07:07 am (UTC)

can you elaborate on the "(cannot be displayed)" issue

Why exactly is the server generating that rather than an error? And what is causing it?

I am sending the Unicode version in my script and sometimes have seen this problem... trashing old entries...
From: marksmith
2004-06-20 10:19 am (UTC)

Re: can you elaborate on the "(cannot be displayed)" issue

Because the day selecttype is meant to be used to DISPLAY entries only. Example:

You create a calendar widget in your program. You use the getdaycounts mode to figure out how many posts there are per day, and you populate these numbers on the widget.

Your user clicks on a day that says '1'. But, since the entry has an invalid encoding, the server spits out the entry with the subject and body modified so that the user knows the entry is there--it just can't be displayed through a client that doesn't support Unicode properly.

Much better than throwing an error. This also demonstrates what the selecttype of day is for--viewing entries on a specific day, just to show users what happened that day. That's it.

It is not meant for backing up journals, and especially not for downloading entries in order to reupload them to the server.

Use syncitems, please. The servers, and your users, will thank you. :)
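For anyone who wants the shape of a syncitems loop, here is a rough Python sketch. It is not taken from jbackup.pl. The transport is injected as a `call(method, params)` function, which is hypothetical, and the reply fields (`syncitems`, `item`, `time`, `count`, `total`) are assumptions about the protocol that should be checked against the docs:

```python
def collect_sync_items(call, username, max_rounds=1000):
    """Repeat syncitems until the server reports nothing newer.

    call(method, params) -> dict is a hypothetical transport function.
    Returns a dict mapping item keys (assumed to look like 'L-123') to times.
    """
    seen = {}
    lastsync = ""
    for _ in range(max_rounds):
        params = {"username": username, "ver": 1}
        if lastsync:
            params["lastsync"] = lastsync
        reply = call("syncitems", params)
        items = reply.get("syncitems", [])
        if not items:
            break
        for item in items:
            seen[item["item"]] = item["time"]
        # Times are assumed to be 'YYYY-MM-DD HH:MM:SS', which sorts as text.
        newest = max(item["time"] for item in items)
        if newest == lastsync:
            break  # no progress, so stop rather than loop forever
        lastsync = newest
        if reply.get("count") == reply.get("total"):
            break  # this batch was the last one
    return seen
```

Each round asks only for what changed after the newest time seen so far, so a whole journal comes down in a handful of requests instead of one per day.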
From: elo_sf
2004-06-20 10:26 am (UTC)

Re: can you elaborate on the "(cannot be displayed)" issue

I'm still not sure I understand why the server is doing what it is doing. Evan's suggestion for how to use syncitems makes sense, but whew, that is a much harder way to handle the problem.
From: marksmith
2004-06-20 12:06 pm (UTC)

Re: can you elaborate on the "(cannot be displayed)" issue

Eh. Then don't worry about why it does it that way, just know it does. Never use selecttype of day when you are going to reupload the data. Always always use syncitems.

It may be "harder", but it's going to reduce hits to the server by several orders of magnitude, which is nice and also means your bot isn't likely to get itself banned by being dumb. :)
From: hythloday
2004-06-21 05:56 am (UTC)

Re: can you elaborate on the "(cannot be displayed)" issue

It would be really nice if that was documented in the API.
From: altamira16
2004-06-20 12:44 pm (UTC)

Unicode

Like I wrote before, I am clueless about this stuff. How do you use Unicode, and how do you handle stuff that is not properly encoded?
From: decadence1
2004-06-22 07:41 am (UTC)

Re: Unicode

This document on unicode may/may not help: http://www.livejournal.com/community/web_ui/1947.html
From: quirrc
2004-06-28 02:49 am (UTC)

could you make it clear: are selecttype multiple and syncitems equivalent (or almost equivalent) with regard to server load?
From: marksmith
2004-06-28 08:04 am (UTC)

There is no selecttype:multiple. I don't understand your query.
From: quirrc
2004-06-28 10:48 am (UTC)

it's in ljprotocol.pl. or is it not activated? i have not tried it, but i thought it works, just not documented
From: marksmith
2004-06-29 01:11 pm (UTC)

I never noticed that.

I suppose you could use that--you could use syncitems to get the jitemids, then use selecttype = multiple to get them. You'd have to use the XML-RPC protocol mode, though. (Not that that's a problem.)

Looking at the code, there isn't much difference between doing a selecttype of lastsync and a selecttype of multiple. I think that a multiple would probably be easier on the server, but I don't know that it makes enough of a difference to warrant recommending it over the other method.
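
The syncitems-then-multiple idea could be sketched in Python roughly as follows. The `L-123` key format, the `itemids` parameter name, and the batch size of 100 are all assumptions that have not been verified against ljprotocol.pl, and both helper names are made up:

```python
def journal_itemids(sync_item_keys):
    """Pull numeric ids out of keys like 'L-123', skipping other kinds such as 'C-45'."""
    ids = []
    for key in sync_item_keys:
        kind, _, number = key.partition("-")
        if kind == "L" and number.isdigit():
            ids.append(int(number))
    return ids


def multiple_requests(ids, batch_size=100):
    """Yield getevents parameter dicts, each asking for up to batch_size entries."""
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        yield {
            "selecttype": "multiple",
            "itemids": ",".join(str(i) for i in chunk),  # assumed parameter name
            "ver": 1,
        }
```

With 250 ids and the default batch size this produces three parameter sets, so the number of getevents calls grows with the number of entries rather than the number of days.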