PerlMonks  

The Monastery Gates


New Questions
Server-side Websocket implementations in non-event driven HTTP Server Environments
2 direct replies — Read more / Contribute
by unlinker
on Jun 08, 2013 at 15:16

    I am trying to understand implementations/options for server-side WebSocket endpoints, particularly in Perl using PSGI/Plack, and I have a question: why are all server-side WebSocket implementations based around event-driven PSGI servers (Twiggy, Tatsumaki, etc.)?

    I get that WebSocket communication is asynchronous, but a non-event-driven PSGI server (say, Starman) could spawn an asynchronous listener to handle the WebSocket side of things. I have seen (but not understood) PHP implementations of WebSocket servers, so why can't the same be done with PSGI without switching to an event-driven server?
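
    One reason event-driven servers keep coming up: a WebSocket connection is long-lived, and a preforking worker (Starman's model) is tied up for the whole lifetime of one connection, while a single event loop can watch many sockets at once. The following is a core-Perl-only sketch of that multiplexing idea, using IO::Select and socketpair as stand-ins for real client connections; it is not Twiggy's actual API.

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

# Two socketpairs stand in for two independent client connections,
# both serviced by a single process with a single loop.
socketpair(my $srv1, my $cli1, AF_UNIX, SOCK_STREAM, PF_UNSPEC) or die $!;
socketpair(my $srv2, my $cli2, AF_UNIX, SOCK_STREAM, PF_UNSPEC) or die $!;

my $sel = IO::Select->new($srv1, $srv2);

# The "clients" each send a message.
syswrite($cli1, "hello from 1\n");
syswrite($cli2, "hello from 2\n");

# The event loop: block until any connection is readable, service it,
# and go back to waiting -- no worker is pinned to either connection.
my %received;
while (keys %received < 2) {
    for my $fh ($sel->can_read(5)) {
        sysread($fh, my $buf, 1024) or next;
        $received{ fileno($fh) } = $buf;
    }
}
my @frames = sort values %received;
```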

Compact data classes
3 direct replies — Read more / Contribute
by creeble
on Jun 07, 2013 at 19:13
    I have a Class::Struct (using array form, fwiw) containing about a dozen string members. I build a database of these by sorting them into an array.

    The total length of the strings used to create this database is about 2MB; about 150 bytes of actual data times about 15,000 records, say.

    As an array of Class::Struct objects, this takes up over 15MB in RAM. I am unfortunately running on an embedded processor with limited RAM.

    How could I store these strings in a more compact way, and still use a convenient accessor function (preferably, the same one, namely a class function) to retrieve the data? The db is read-only once it has been sorted.

    Or, to perhaps put it another way, what would it take to write something like Class::Struct in XS with a non-growable string as the only data type?

    Or am I already missing something from CPAN? What would be a more compact form to store
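
    One compact layout is to keep each record as a single scalar with the fields joined by a separator, and generate accessors that split on demand, trading a little CPU for memory. A minimal sketch of the idea (the field names and the NUL separator are made up for illustration; real data must not contain the separator byte):

```perl
use strict;
use warnings;

package CompactRec;

# Hypothetical field names; a real record would list its dozen fields.
my @FIELDS = qw(id name city);
my %INDEX;
@INDEX{@FIELDS} = 0 .. $#FIELDS;

sub new {
    my ($class, %args) = @_;
    # One scalar per record instead of one scalar per field.
    my $packed = join "\0", map { $args{$_} // '' } @FIELDS;
    return bless \$packed, $class;
}

# Generate one accessor per field; each splits the record on demand.
for my $field (@FIELDS) {
    no strict 'refs';
    *$field = sub {
        ( split /\0/, ${ $_[0] }, scalar @FIELDS )[ $INDEX{$field} ];
    };
}

package main;

my $rec = CompactRec->new(id => 42, name => 'alice', city => 'Oslo');
```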

Space prepended to Array Print
5 direct replies — Read more / Contribute
by JockoHelios
on Jun 07, 2013 at 14:48
    I've run across an odd little quirk, and I'm wondering if anyone has seen this before.

    I'm loading a text file into an array. The text file has data saved from a previous array, example:

    20130516,530,730
    20130516,731,1100
    20130516,1101,1200

    To confirm that the data is loaded into the array, I printed out the loaded array using this code:

    print "@LoadedArray";

    All the data prints out as it is in the text file, with one exception: after the first row of the loaded array is printed, each subsequent row has one extra space in front of it. Example, using underscores to simulate the added space:

    20130516,530,730
    _20130516,731,1100
    _20130516,1101,1200

    When I print the array using the following code:

    foreach $ArrayRow( @LoadedArray ) { print "$ArrayRow"; }

    the added spaces don't appear - the data prints as it is in the text file.

    I don't think this is really a problem, just a curiosity I found (fingers crossed).
    The only "problem" is that stuff like this tends to stick in my brain until I know why it happened :)
    Dyslexics Untie !!!
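
    For what it's worth, a space in a case like this comes from array interpolation: inside double quotes the elements are joined with $" (a single space by default), so a space lands between each element's trailing newline and the next element. A self-contained demonstration:

```perl
use strict;
use warnings;

my @LoadedArray = ("20130516,530,730\n", "20130516,731,1100\n");

# Interpolation joins the elements with $" (a space by default) ...
my $interpolated = "@LoadedArray";

# ... while printing element by element (or joining on '') does not.
my $plain = join '', @LoadedArray;
```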
Storing regex values in an array and processing the array
3 direct replies — Read more / Contribute
by diamondsandperls
on Jun 07, 2013 at 13:36
    my @patterns = ( qr/Record contentId\=\"\d+\"/ );
    my $final_file = "C:/Users/$sso/Desktop/archer_$searchid.txt";
    open(my $final_fh, '>', $final_file) or die "Failed to open $final_file - $!";
    open(my $input_fh, '<', $output_file) or die "Failed to open $output_file: $!";
    while (<$input_fh>) {
        foreach my $pattern (@patterns) {
            print {$final_fh} $pattern, "\n";
        }
    }
    close $final_fh;

    With the preceding code I get the following output when printing. I think I am doing something wrong in how I store the regex matches, but I am not sure.

    my @patterns = ( qr/Record contentId\=\"\d+\"/ );

    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
    (?-xism:Record contentId\=\"\d+\")
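
    A compiled regex stringifies to its (?-xism:...) form, so printing $pattern prints the pattern itself rather than anything it matched. A sketch of capturing the matched id instead, with made-up input lines standing in for the contents of $output_file:

```perl
use strict;
use warnings;

my @patterns = ( qr/Record contentId="(\d+)"/ );

# Hypothetical lines standing in for the contents of $output_file.
my @input = (
    'junk Record contentId="123" junk',
    'no match on this line',
    'Record contentId="456"',
);

my @matched;
for my $line (@input) {
    for my $pattern (@patterns) {
        # Store/print what matched ($1), not the pattern itself.
        push @matched, $1 if $line =~ $pattern;
    }
}
```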

Modifying SelfLoader to save eval() text
1 direct reply — Read more / Contribute
by rockyb
on Jun 07, 2013 at 13:13

    I noticed the following problem in using the perl debugger, Devel::Trepan, with code that uses the SelfLoader perl module.

    SelfLoader uses eval() to install subroutines at run time. The Perl debugger would like access to the text of the eval string: with it, when one is debugging inside one of those subroutines, the debugger can show where you are. Two things in SelfLoader defeat this: the added text is kept in a lexically scoped (my) hash, %Cache, and the entry is deleted just before the code is run.

    The general outline of the SelfLoader is:

    package SelfLoader;
    ...
    my %Cache;    # private cache for all SelfLoader's client packages
    ...
    AUTOLOAD {
        our $AUTOLOAD;    # $AUTOLOAD is the fully-qualified function name, e.g. MyPackage::fn
        ...
        # set $Cache{$AUTOLOAD} to the file text from __DATA__ to __END__
        ...
        delete $Cache{$AUTOLOAD};
        goto &$AUTOLOAD;
    }

    So basically, when debugging I'd like to rewrite the above to change my %Cache to our %Cache and remove the delete $Cache{$AUTOLOAD}; line.

    Alternatively, the routine that updates %Cache is called SelfLoader::_add_to_cache(), so that could be augmented to save the text to another hash somewhere. But the problem is that it still needs to update the lexically scoped %Cache hash.

    The two techniques that come to mind are monkey-patching and the Decorator pattern. Alas, because of the specifics of how SelfLoader works which I won't go into here (and am not totally positive I understand), I don't see how to do either.

    Thoughts on how to address? Thanks.
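
    For the monkey-patching route, one shape it might take is wrapping the cache-filling sub and copying each entry into a shadow hash before the original runs. The package below is a simplified stand-in for SelfLoader, not the real module, so this is only a sketch of the wrapping technique:

```perl
use strict;
use warnings;

# A simplified stand-in for SelfLoader -- NOT the real module.
package FakeLoader;
my %Cache;    # lexically scoped, like SelfLoader's private cache
sub _add_to_cache {
    my ($fullname, $text) = @_;
    $Cache{$fullname} = $text;
}
sub cached { $Cache{ $_[0] } }

package main;

our %SavedEval;    # shadow hash a debugger could consult later

{
    no warnings 'redefine';
    my $orig = \&FakeLoader::_add_to_cache;
    *FakeLoader::_add_to_cache = sub {
        $SavedEval{ $_[0] } = $_[1];    # copy before the original runs
        goto &$orig;                    # then behave exactly as before
    };
}

FakeLoader::_add_to_cache('My::fn', 'sub fn { 42 }');
```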

Hash dereferencing in a loop
2 direct replies — Read more / Contribute
by willjones
on Jun 07, 2013 at 12:32

    I get the following error message when I try to compile this code with use strict turned on:

    sub failsToCompile {
        my $hashRef = shift;
        my $val;
        foreach my $key (keys %$hashRef) {
            $val .= $hashRef{$key};
        }
        return $val;
    }
    Error Message: Global symbol "%hashRef" requires explicit package name ...

    I finally discovered a workaround. When I rework the code a little, as in the following sub, the error message goes away. So I know how to get around it now, but my question is: why? Why can't I refer to the hash that the hash reference points to in the for loop without first dereferencing it and assigning it to another variable, as I did in the following sub? Also, what is making Perl think this is a global symbol? Thanks in advance for input/explanations.

    sub worksCompilingFine {
        my $hashRef = shift;
        my %hash = %$hashRef;
        my $val;
        foreach my $key (keys %hash) {
            $val .= $hash{$key};
        }
        return $val;
    }
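
    For comparison: $hashRef{$key} (no arrow) refers to an element of a completely separate hash named %hashRef, which was never declared; that is the symbol strict complains about. With the arrow, the reference is dereferenced element by element and no intermediate copy is needed. A small sketch (sort added only to make the result deterministic):

```perl
use strict;
use warnings;

sub alsoCompilesFine {
    my $hashRef = shift;
    my $val = '';
    foreach my $key (sort keys %$hashRef) {
        $val .= $hashRef->{$key};    # arrow: element of the referenced hash
    }
    return $val;
}

my $result = alsoCompilesFine({ a => 1, b => 2 });
```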
Multithreading leading to Out of Memory error
2 direct replies — Read more / Contribute
by joemaniaci
on Jun 07, 2013 at 11:58

    So I have a new multithreading implementation that should be smooth running, and for the most part it is. It wasn't until I tried evaluating 231 GB of files that I started getting "Out of memory!" errors and having the program die. I have been over everything three times now and still cannot figure it out. So here is what I have...

    use threads ('yield', 'stack_size' => 64*4096, 'exit' => 'threads_only', 'stringify');
    use Thread::Queue;
    # use various others (File::..., DBI, ...)

    my @FoundFiles = ...;   # subroutine to get all of the applicable files (and their directories)
    my $Threads = 8;
    my $workq = Thread::Queue->new();
    $workq->enqueue(@FoundFiles);
    $workq->enqueue(undef) for (1..$Threads);
    threads->create('executeall') for (1..$Threads);

    sub executeall {
        while (my $i = $workq->dequeue()) {
            last if $i eq undef;
            if ($i =~ /filetypea/) { parseitthisway($i); }
            if ($i =~ /filetypeb/) { parseitanotherway($i); }
            ....
        }
        threads->detach();
    }

    Now in my googling I have come across several references talking about perl threads maybe holding on to excess data over time. As you can tell, the only real global piece of data I am using is the queue. Therefore, all the real data is being built up in the individual parse subroutines, meaning the perl garbage cleanup should be taking care of that once the subroutine returns. Unless it's bugged or something. The only thing I can think of is destroying and recreating my threads every 200-300 file iterations.
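
    As an aside on the queue handling: newer versions of Thread::Queue have an end() method, which avoids enqueueing one undef sentinel per thread, and collecting results through a second queue keeps per-thread data out of globals. A small self-contained sketch of that pattern (the worker body here is deliberately trivial):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $workq = Thread::Queue->new();
my $done  = Thread::Queue->new();    # results flow out through a queue too

my @workers = map {
    threads->create(sub {
        # dequeue() returns undef once end() is called and the queue drains.
        while (defined(my $item = $workq->dequeue())) {
            $done->enqueue("seen:$item");
        }
    });
} 1 .. 4;

$workq->enqueue($_) for 1 .. 20;
$workq->end();                       # no per-thread undef sentinels needed
$_->join() for @workers;

my @results = map { $done->dequeue_nb() } 1 .. 20;
```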

Dereference inside a regex
3 direct replies — Read more / Contribute
by tobias_hofer
on Jun 07, 2013 at 10:21
    Hi all,

    It is possible to use variables in a regex, as in m/$myvar/. So why is it not possible to do something like this:
    my $state_smybol = {
        1 => 'Image component sizes',
        2 => ' Code \(inc\. data\) RO Data RW Data ZI Data Debug Object Name',
    };
    ...
    if ( $MapFile->[$iterator] =~ m/$state_symbol->{1}/ ) {..
    The regex then complains about uninitialized values. Why?
    I don't really want to create an additional temporary variable just to hold the value from the hash before running the regex.

    Any help is highly welcome!

    Best regards!
    Tobias
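
    Interpolating a hash-element dereference inside m// does work in Perl, as the self-contained check below shows. Note, though, that the posted code declares $state_smybol but matches against $state_symbol; a mismatch like that would produce exactly the uninitialized-value warnings described.

```perl
use strict;
use warnings;

my $state_symbol = {
    1 => 'Image component sizes',
};

my @lines = (
    '... Image component sizes ...',
    'something else entirely',
);

# A hash-element dereference interpolates fine inside m//.
my @hits = grep { m/$state_symbol->{1}/ } @lines;
```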
Memory efficient statistical distribution class
1 direct reply — Read more / Contribute
by Dallaylaen
on Jun 07, 2013 at 07:47

    I'd like to analyse some data (say, web-service response times) and get various statistical info, mainly percentiles/quantiles and presence of outstanding values.

    I know about Statistics::Descriptive, however, I don't want to store all the data in memory. On the other hand, having my results off by a few % would be fine, I only care about huge differences.

    So I came up with the following idea: create an array of logarithmic buckets, and count the data points landing in each bucket. Having the data spread across 6 orders of magnitude with a guaranteed precision of 1% still leaves me with 6 * log 10 / log 1.01 =~ 1400 buckets, which is perfectly fine (36 KB of memory, given current Perl's scalar size).

    Counting percentiles is simple - just add up bucket counters until $sum exceeds $percentage * $total_count.

    However, before I start writing actual code, I would like to ask which memory-efficient statistical modules and algorithms already exist (for Perl, or maybe other languages).

    Looks like there's a similar Stackoverflow question, and there's similar method proposed in one of the answers. Haven't found a ready-made Perl implementation, though.
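
    The bucket idea from the post can be sketched in a few lines; this is only an illustration of the approach (the sub names are mine, and it assumes samples >= 1):

```perl
use strict;
use warnings;

my $BASE = 1.01;    # ~1% relative precision per bucket
my @bucket;         # one counter per logarithmic bucket

sub add_sample {
    my ($x) = @_;   # assumes $x >= 1; shift/scale smaller data first
    $bucket[ int( log($x) / log($BASE) ) ]++;
}

sub percentile {
    my ($p) = @_;
    my $total = 0;
    $total += $_ // 0 for @bucket;
    my ($need, $sum) = ($p * $total, 0);
    for my $i (0 .. $#bucket) {
        $sum += $bucket[$i] // 0;
        # Report the geometric middle of the bucket that crosses the mark.
        return $BASE ** ($i + 0.5) if $sum >= $need;
    }
    return;
}

add_sample($_) for 1 .. 1000;    # toy data: uniform on 1..1000
my $median = percentile(0.5);    # expect roughly 500, within ~1%
```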

Unicode again, in Win7 cmd
3 direct replies — Read more / Contribute
by hdb
on Jun 07, 2013 at 07:25

    I am not able to get Unicode to work under the Win7 command prompt. I am changing the code page to 65000 and want to print the diagonals from Re^2: Random maze generator. I cannot post all the permutations I have tried from the various Perl Unicode tutorials and FAQs. So here is my skeleton code; I would like to know which combination of encode, decode, etc. makes this work, and which steps are unnecessary. Many thanks

    use strict;
    use warnings;
    use Encode;
    use utf8;

    binmode STDOUT, ':encoding(UTF-8)';
    my $enc = "utf-8";
    system( "chcp 65000" );

    # from node http://www.perlmonks.org/?node_id=843144
    my $ne = "\xe2\x95\xb1";
    my $nw = "\xe2\x95\xb2";
    print $ne, $nw, "\n";
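
    Independent of the console code page (note that 65001 is the UTF-8 code page; 65000 is UTF-7), the decode step itself can be checked in isolation: "\xe2\x95\xb1" is the UTF-8 byte sequence for U+2571, so decoding it should yield a single character. A minimal check:

```perl
use strict;
use warnings;
use Encode qw(decode);

# "\xe2\x95\xb1" is the UTF-8 encoding of U+2571 (a box-drawing
# diagonal). As a byte string it has length 3; decoding turns it into
# one Perl character that an :encoding(UTF-8) layer can re-encode.
my $bytes = "\xe2\x95\xb1";
my $char  = decode('UTF-8', $bytes);

my $len_bytes = length $bytes;
my $len_char  = length $char;
my $codepoint = ord $char;
```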
Pull users with multiple search
3 direct replies — Read more / Contribute
by johnprince1980
on Jun 07, 2013 at 01:33
    Hi, I have a typical requirement: find users with at least three occurrences in a log within an hour. Please guide me on how I can accomplish this.
    [04/Jun/2013:13:06:13 -0600] conn=13570 op=14 msgId=13 - BIND dn="uid=xyz123,ou=People,o=xyz.com" method=128 version=3
    [04/Jun/2013:15:06:13 -0600] conn=13570 op=14 msgId=15 - RESULT err=0 tag=101 nentries=48030 etime=139 SRCH=Q

    Basically, we need to find any user (i.e. uid=xyz123) that gets "SRCH=Q" on a particular connection more than three times within an hour. As the log shows, the lines are related by "conn=13570". In brief, here is the logic:
    - Get the "SRCH=Q" occurrences.
    - For each, take the associated conn #, go back, and get the bind user.
    - For that bind user, count the "SRCH=Q" occurrences; if > 3, run the add-group command.

    Thanks, JPrince
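
    One way the outlined logic might be sketched: remember which uid bound on each connection, then count SRCH=Q hits per user per clock hour. Note this buckets by calendar hour rather than a sliding 60-minute window, which may or may not match the requirement; the sample lines are made up in the same shape as the posted log.

```perl
use strict;
use warnings;

# Hypothetical sample lines in the same shape as the posted log.
my @log = (
    '[04/Jun/2013:13:06:13 -0600] conn=13570 op=14 msgId=13 - BIND dn="uid=xyz123,ou=People,o=xyz.com" method=128 version=3',
    '[04/Jun/2013:13:10:00 -0600] conn=13570 op=15 msgId=15 - RESULT err=0 SRCH=Q',
    '[04/Jun/2013:13:20:00 -0600] conn=13570 op=16 msgId=16 - RESULT err=0 SRCH=Q',
    '[04/Jun/2013:13:30:00 -0600] conn=13570 op=17 msgId=17 - RESULT err=0 SRCH=Q',
);

my (%uid_of_conn, %count);
for my $line (@log) {
    my ($hour) = $line =~ m{^\[(\d+/\w+/\d+:\d+)};    # date plus hour
    my ($conn) = $line =~ /conn=(\d+)/;
    if (my ($uid) = $line =~ /BIND dn="uid=([^,"]+)/) {
        $uid_of_conn{$conn} = $uid;                   # remember who bound
    }
    elsif ($line =~ /SRCH=Q/ && defined $uid_of_conn{$conn}) {
        $count{ $uid_of_conn{$conn} }{ $hour }++;     # per user, per hour
    }
}

my @flagged;
for my $uid (sort keys %count) {
    for my $hour (keys %{ $count{$uid} }) {
        push @flagged, $uid if $count{$uid}{$hour} >= 3;
    }
}
```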
Rand() with ?log10? distribution?
2 direct replies — Read more / Contribute
by BrowserUk
on Jun 06, 2013 at 22:17

    I want a rand function that produces lots of small numbers and a few big ones.

    Say, twice as many 0-9 as 10-99, and twice as many 0-99 as 100-999. Etc.

    Feels like log10 should be in there somewhere. Or maybe 10**something.

    Note: I'm not looking for a function embedded in one of the big Math::* C libraries, as I will almost certainly want to tweak the distribution.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
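
    One shape that gives "twice as many per decade" is to pick the decade with geometrically decreasing probability and then spread the value log-uniformly within it. A pure-Perl sketch (the cap and the names are arbitrary, and the within-decade spread is only one of several reasonable choices):

```perl
use strict;
use warnings;

sub log_rand {
    my ($max_decades) = @_;   # e.g. 3 => results in [1, 1000)
    # Pick a decade: each one half as likely as the one before it.
    my $d = 0;
    $d++ while $d < $max_decades - 1 && rand() < 0.5;
    # Spread uniformly in log space within that decade.
    return 10 ** ($d + rand());
}

srand(42);                    # fixed seed so repeated runs match
my @sample = map { log_rand(3) } 1 .. 10_000;

my $small = grep { $_ < 10 } @sample;              # expect about half
my $mid   = grep { $_ >= 10 && $_ < 100 } @sample; # expect about a quarter
```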