Perl Review

Subroutines that have been defined earlier in a program may also be called without parentheses around the argument list.

A subroutine may be called using the "&" prefix. The "&" is optional in modern Perl, and so are the parentheses if the subroutine has been pre-declared. However, the "&" is not optional when you are just naming the subroutine (such as when it is used as an argument to defined() or undef()). Nor is it optional when you want to make an indirect subroutine call through a subroutine name or reference using the &$subref() or &{$subref}() construct.

If a subroutine is called using the "&" prefix, the argument list is optional. If the list is omitted, no new @_ is set up for the subroutine; the caller's current @_ is visible to it instead. Any call that supplies parentheses, with or without the "&", gets its own @_.

foo();       # pass the null list
&foo();      # the same
&foo;        # foo sees the current @_, much like foo(@_)
foo;         # like foo() if and only if foo is pre-declared

Not only does the "&" form make the argument list optional, it also disables any prototype checking on the arguments you do provide.
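The call forms above can be compared in a small sketch. The names show_args, wrapper and one_arg are made up for illustration; the sketch only assumes a Perl 5 interpreter.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub show_args { return scalar(@_) }

# Hypothetical wrapper: calls show_args in three different ways.
sub wrapper {
    my $explicit_empty = show_args();     # new, empty @_
    my $amp_empty      = &show_args();    # same: new, empty @_
    my $amp_shared     = &show_args;      # no new @_; sees wrapper's @_
    return ($explicit_empty, $amp_empty, $amp_shared);
}

my @counts = wrapper('a', 'b', 'c');
print "@counts\n";    # prints "0 0 3"

# A ($) prototype forces scalar context on the single argument.
sub one_arg ($) { return scalar(@_) }

my @list      = (10, 20, 30);
my $checked   = one_arg(@list);     # prototype applied: one argument (the count)
my $unchecked = &one_arg(@list);    # prototype ignored: three arguments
print "$checked $unchecked\n";      # prints "1 3"
```

The last two calls differ only in the "&", which is enough to switch the prototype off.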

Lexical scopes of control structures are not bounded precisely by the braces that delimit their controlled blocks. Control expressions are part of the scope too. Thus, in the loop:

while (defined(my $line = <>)) { $line = lc($line); }
continue { print $line; }

the scope of $line extends from its declaration throughout the rest of the loop construct (including the continue clause), but not beyond it.

By convention, a leading underscore marks a variable or subroutine name as private to the package that defines it. Names consisting of a single underscore, such as $_, are still forced into package main and remain global.

When unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are forced into package main, even when used inside another package. Any other identifier X declared without qualification inside a package belongs to that package and can be used unqualified by functions in the same package. That is not the case for the identifiers listed above.

If you have a package named m, s, or y, you cannot use the qualified form of an identifier, because it would be interpreted as a pattern match, a substitution, or a transliteration, respectively.

Eval()ed strings are compiled in the package in which the eval() was compiled.
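A minimal sketch of the eval() rule above. The package name Zoo and the sub where are hypothetical; __PACKAGE__ is the built-in token that names the package being compiled.

```perl
use strict;
use warnings;

package Zoo;    # hypothetical package name

# The string eval below is compiled while Zoo is the current package,
# so the evaluated code also sees Zoo as its package.
sub where { return eval '__PACKAGE__' }

package main;

# Calling from main does not change the package of the evaluated string.
my $pkg = Zoo::where();
print "$pkg\n";    # prints "Zoo"
```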

autouse MODULE => qw(sub1 sub2 sub3)    # defer require MODULE until someone calls one of
                                        # the specified subroutines

blib           # manipulate @INC at compile time to use MakeMaker's uninstalled version of a package
diagnostics    # force verbose warnings
integer        # compute arithmetic in integer instead of double
less           # request less of something from the compiler
lib            # manipulate @INC at compile time
locale         # use or ignore the current locale for built-in functions
ops            # restrict named opcodes when compiling or running Perl
overload       # overload basic Perl operations
re             # alter the behavior of regular expressions
sigtrap        # enable simple signal handling
strict         # restrict unsafe constructs
subs           # predeclare subroutines
vars           # predeclare global variable names

Elements of the @_ array are special in that they are not copies of the original arguments. Rather, they are aliases for those arguments. That means that if values are assigned to $_[0], $_[1], $_[2], etc., each value is actually assigned to the corresponding argument.
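The aliasing can be seen in a short sketch. The sub names double_in_place and double_copy are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical sub: doubles its arguments in place through the @_ aliases.
sub double_in_place {
    $_ *= 2 for @_;
    return;
}

my ($x, $y) = (3, 5);
double_in_place($x, $y);
print "$x $y\n";    # prints "6 10"

# Copying @_ first breaks the alias, so the caller's variables stay untouched.
sub double_copy {
    my @copy = @_;
    $_ *= 2 for @copy;
    return @copy;
}

my @doubled = double_copy($x, $y);
print "$x $y -> @doubled\n";    # prints "6 10 -> 12 20"
```

Copying the arguments with my (...) = @_; at the top of a subroutine is the usual way to avoid modifying the caller's data by accident.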

The arrow operator (->) takes a reference on its left and either an array index (in square brackets) or a hash key (in curly braces) on its right. The -> operator can also be applied to a subroutine reference, with an argument list in parentheses on its right.

\$scalarName: reference to the scalar $scalarName; dereference with ${$x}
\%hashName:   reference to the hash %hashName; dereference with %{$x}
\@arrayName:  reference to the array @arrayName; dereference with @{$x}
\&subName:    reference to the subroutine subName; dereference with &{$x}

$x->[1]             # Access the element whose index is 1.  $x is a reference to an array.
$x->{'key'}         # Access the hash element whose key is 'key'.  $x is a reference to a hash.
$x->(@args)         # Call a subroutine.  $x is a reference to a subroutine.
$x->methodName()    # Call a method.  $x is an object.

Result of ref($x), where $x is a reference of some type:
"SCALAR"
"ARRAY"
"HASH"
"CODE"
"IO"        # $x is a reference to an IO handle
"GLOB"      # $x is a reference to a typeglob
"Regexp"    # $x is a reference to a pre-compiled pattern (qr//)
"REF"       # $x is a reference to another reference

$row1_ref = [1,2,3];    # reference to an anonymous array
$row2_ref = [2,4,6];
$row3_ref = [3,6,9];
$table = [$row1_ref, $row2_ref, $row3_ref];
$table->[$x][$y]        # element at row $x, column $y

$association = {"cat" => "nap", "dog" => "gone", "mouse" => "ball"};    # reference to an anonymous hash
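The reference notation above can be exercised in one runnable sketch. The variable names @rows, $assoc and $add are chosen for illustration only.

```perl
use strict;
use warnings;

my @rows  = ([1, 2, 3], [2, 4, 6], [3, 6, 9]);
my $table = \@rows;                               # reference to a named array
my $assoc = { cat => 'nap', dog => 'gone' };      # reference to an anonymous hash
my $add   = sub { return $_[0] + $_[1] };         # reference to an anonymous sub

print join(' ', ref($table), ref($assoc), ref($add), ref(\$table)), "\n";
# prints "ARRAY HASH CODE REF"

print $table->[2][1], "\n";        # 6: the arrow between subscripts is optional
print ${$table}[2]->[1], "\n";     # 6: same element, block-style dereference
print $assoc->{cat}, "\n";         # nap
print $add->(2, 3), "\n";          # 5
print scalar(@{$table}), "\n";     # 3: number of rows
print ref(qr/x/), "\n";            # Regexp
```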

In Perl, a closure is a subroutine that refers to one or more lexical variables declared outside the subroutine itself. If we define the closure within a block:

{
    my $name = "Damian";
    sub print_myname {
        print $name,"\n";
    }
}

then the definition of that subroutine confers eternal life on the otherwise doomed variable $name.

Closures are a means of giving a subroutine its own private memory: variables that persist between calls to that subroutine and are accessible only to it. Two or more closures can share the same private memory or state (encapsulation).
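A minimal sketch of two closures sharing private state. The factory make_counter and its return values are hypothetical names, not part of any library.

```perl
use strict;
use warnings;

# Hypothetical factory: each call creates a fresh $count that only the
# two returned closures can reach.
sub make_counter {
    my $count = @_ ? shift : 0;
    my $next  = sub { return ++$count };    # increments the shared $count
    my $peek  = sub { return $count };      # reads the same $count
    return ($next, $peek);
}

my ($next, $peek) = make_counter(10);
$next->() for 1 .. 3;
print $peek->(), "\n";    # prints 13

# A second counter gets its own, independent $count.
my ($other_next, $other_peek) = make_counter();
$other_next->();
print $other_peek->(), " ", $peek->(), "\n";    # prints "1 13"
```

Nothing outside make_counter can name $count, yet it lives on for as long as either closure does.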

If a reference of any kind is assigned to a typeglob, only the typeglob slot of the corresponding kind is replaced.

What is a typeglob?

*dick = *richard;

The * is the special typeglob notation, referring to all identifiers with the given name. This statement causes all variables, subroutines, formats, filehandles, and directory handles accessible via the identifier richard to also be accessible via the identifier dick.

*dick = \$richard;

This aliases only $richard: it makes $richard and $dick the same variable, but leaves @richard and @dick as separate arrays.
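The difference between aliasing one slot and aliasing the whole typeglob can be sketched as follows. The variables are declared with our only so that the example compiles under strict; the initial values are arbitrary.

```perl
use strict;
use warnings;

our $richard = 'Richard';
our @richard = (1, 2, 3);
our ($dick, @dick);

*dick = \$richard;            # alias only the scalar slot
$dick = 'Dick';               # writes through to $richard
print "$richard\n";           # prints "Dick"
print scalar(@dick), "\n";    # prints 0: @dick is still its own, empty array

*dick = *richard;             # now alias every slot
push @dick, 4;                # modifies @richard as well
print "@richard\n";           # prints "1 2 3 4"
```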

%somehash = ();
*somehash = fn(\%anotherHash);

If the function fn(\%anotherHash) returns a hash reference, you do not have to dereference the return value; you can use it directly as %somehash.
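A sketch of the same idea with a concrete function. The names populate and %units are illustrative; any function that returns a hash reference would do.

```perl
use strict;
use warnings;

our %units;    # declared so that strict accepts the unqualified name

# Hypothetical function that builds a hash and returns a reference to it.
sub populate {
    my %newhash = (km => 10, kg => 70);
    return \%newhash;
}

*units = populate();       # install the returned hash in the hash slot of *units
print $units{kg}, "\n";    # prints 70; no explicit dereference needed
print join(',', sort keys %units), "\n";    # prints "kg,km"
```

Only the hash slot is replaced here; a scalar $units or an array @units would be left alone.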

Another use of typeglobs is to create constant scalars:

*PI = \3.14159;

*foo{PACKAGE}    # The name of the package that foo comes from
*foo{NAME}       # The name of foo within that package.  These may be useful in
                 # a subroutine that is passed a typeglob as an argument and
                 # needs to find the package and name of that argument.

*foo{THING}      # This notation obtains a reference to an individual
                 # element of *foo.  Example: *foo{HASH} yields a reference
                 # to %foo.  *foo{THING} returns undef if that particular
                 # THING has not been used yet, except in the case of scalars:
                 # *foo{SCALAR} returns a reference to an anonymous scalar
                 # if $foo has not been used yet.

sub identify_typeglob {
    my $glob = shift;
    print "You came from ", *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
}

package Some::Module;
use strict;
use 5.004;
BEGIN {
    use Exporter();
    use vars qw($VERSION @ISA @EXPORT @EXPORT_OK); # push up
    $VERSION = 1.00;
    @ISA = qw(Exporter AutoLoader);
    @EXPORT = qw(&func1 &func2 &func3); # exporting by default should be avoided; qw() takes no commas
    @EXPORT_OK = qw($var1 %hashit &func4);
}
use vars @EXPORT_OK;

# non-exported package globals follow:
use vars qw(@more $stuff);

# initialize package globals
$var1 = "";
%hashit = ();

# then the others (which are still accessible as $Some::Module::stuff)
$stuff = "";
@more = ();

# all file-scoped lexicals must be created before the functions below that use them

# file private lexicals go here
my $priv_var = '';
my %secret_hash = ();

# here's a file-private function as a closure, callable as &$priv_func;
my $priv_func = sub {};

What is a constant subroutine?

A constant subroutine is one prototyped to take no arguments and to return a constant expression. Such subroutines are candidates for inlining at compile time.
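A short sketch of two ways to get such a subroutine. The names PI and E and their values are illustrative; the constant pragma ships with Perl.

```perl
use strict;
use warnings;

# An empty prototype plus a constant body makes the sub a candidate for inlining.
sub PI () { 3.14159 }

# The constant pragma generates subs of the same kind.
use constant E => 2.71828;

my $area = PI * 2 ** 2;    # parsed as PI() * 4 thanks to the () prototype
my $sum  = E + 1;          # parsed as E() + 1 for the same reason
printf "%.5f\n", $area;    # prints 12.56636
printf "%.5f\n", $sum;     # prints 3.71828
```

Without the empty prototype, PI * 2 would be parsed as a call to PI with a typeglob argument, so the prototype matters for parsing as well as for inlining.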

What is the purpose of the BEGIN subroutine?

A BEGIN block is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file is parsed. You may have multiple BEGIN blocks; they are executed in order of definition. Once a BEGIN block has run, it is immediately undefined, which means that you cannot call a BEGIN block explicitly. An END block is executed as late as possible. You may have multiple END blocks; they are executed in reverse order of definition.
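The ordering can be observed with a small sketch. The array @order is just a log invented for this example.

```perl
use strict;
use warnings;

our @order;    # records the order in which the blocks run

push @order, 'main code';    # first in the file, but runs after both BEGIN blocks

BEGIN { push @order, 'first BEGIN'  }
BEGIN { push @order, 'second BEGIN' }

END { print "first END (runs last)\n"   }
END { print "second END (runs first)\n" }

print join(', ', @order), "\n";
# prints "first BEGIN, second BEGIN, main code"
```

When the program exits, the END blocks then print their lines in reverse order of definition.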

How does Perl handle the use statement?

Because the use statement implies a BEGIN block, the importing of semantics happens as soon as the use statement is compiled, before the rest of the file is compiled.
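A sketch of the expansion, using the core module List::Util as a stand-in for any module. The BEGIN block below behaves like use List::Util qw(max);.

```perl
use strict;
use warnings;

# Roughly what "use List::Util qw(max);" does behind the scenes:
BEGIN {
    require List::Util;
    List::Util->import(qw(max));
}

# Because the import already happened at compile time, max can be
# called like a built-in list operator, without parentheses.
my $largest = max 3, 9, 4;
print "$largest\n";    # prints 9
```

Had the require and import been done at run time instead, the parser would not yet know max when it reached the call without parentheses.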

How does Perl handle module names that are all lower case?

Module names in all lower case are reserved for pragmas (compiler directives). Most pragmas are lexically scoped, so an inner block can countermand any of them by saying:

no integer;
no strict 'refs';

which lasts until the end of that block.
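A minimal sketch of the lexical scoping, using the integer pragma. The variable names are arbitrary.

```perl
use strict;
use warnings;
use integer;    # integer arithmetic from here to the end of the file ...

my $outer = 10 / 3;        # 3

my $inner;
{
    no integer;            # ... except inside this block
    $inner = 10 / 4;       # 2.5
}

my $after = 10 / 4;        # 2: the enclosing "use integer" applies again
print "$outer $inner $after\n";    # prints "3 2.5 2"
```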

The vars and subs pragmas are not block-scoped. They pre-declare variables or subroutines for the whole file rather than just a block, and you cannot rescind them with no vars or no subs.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License