#############################################################################
# Pod/Parser.pm -- package which defines a base class for parsing POD docs.
#
# Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved.
# This file is part of "PodParser". PodParser is free software;
# you can redistribute it and/or modify it under the same terms
# as Perl itself.
#############################################################################

package Pod::Parser;

use vars qw($VERSION);
$VERSION = 1.35;  ## Current version of this package
require  5.005;   ## requires this Perl version or later

#############################################################################

=head1 NAME

Pod::Parser - base class for creating POD filters and translators

=head1 SYNOPSIS

    use Pod::Parser;

    package MyParser;
    @ISA = qw(Pod::Parser);

    sub command {
        my ($parser, $command, $paragraph, $line_num) = @_;
        ## Interpret the command and its text; sample actions might be:
        if ($command eq 'head1') { ... }
        elsif ($command eq 'head2') { ... }
        ## ... other commands and their actions
        my $out_fh = $parser->output_handle();
        my $expansion = $parser->interpolate($paragraph, $line_num);
        print $out_fh $expansion;
    }

    sub verbatim {
        my ($parser, $paragraph, $line_num) = @_;
        ## Format verbatim paragraph; sample actions might be:
        my $out_fh = $parser->output_handle();
        print $out_fh $paragraph;
    }

    sub textblock {
        my ($parser, $paragraph, $line_num) = @_;
        ## Translate/Format this block of text; sample actions might be:
        my $out_fh = $parser->output_handle();
        my $expansion = $parser->interpolate($paragraph, $line_num);
        print $out_fh $expansion;
    }

    sub interior_sequence {
        my ($parser, $seq_command, $seq_argument) = @_;
        ## Expand an interior sequence; sample actions might be:
        return "*$seq_argument*"     if ($seq_command eq 'B');
        return "`$seq_argument'"     if ($seq_command eq 'C');
        return "_${seq_argument}_"   if ($seq_command eq 'I');
        ## ... other sequence commands and their resulting text
    }

    package main;

    ## Create a parser object and have it parse the file whose name was
    ## given on the command-line (use STDIN if no files were given).
    $parser = new MyParser();
    $parser->parse_from_filehandle(\*STDIN)  if (@ARGV == 0);
    for (@ARGV) { $parser->parse_from_file($_); }

=head1 REQUIRES

perl5.005, Pod::InputObjects, Exporter, Symbol, Carp

=head1 EXPORTS

Nothing.

=head1 DESCRIPTION

B<Pod::Parser> is a base class for creating POD filters and translators.
It handles most of the effort involved with parsing the POD sections
from an input stream, leaving subclasses free to be concerned only with
performing the actual translation of text.

B<Pod::Parser> parses PODs, and makes method calls to handle the various
components of the POD. Subclasses of B<Pod::Parser> override these methods
to translate the POD into whatever output format they desire.

=head1 QUICK OVERVIEW

To create a POD filter for translating POD documentation into some other
format, you create a subclass of B<Pod::Parser> which typically overrides
just the base class implementation for the following methods:

=over 2

=item *

B<command()>

=item *

B<verbatim()>

=item *

B<textblock()>

=item *

B<interior_sequence()>

=back

You may also want to override the B<begin_input()> and B<end_input()>
methods for your subclass (to perform any needed per-file and/or
per-document initialization or cleanup).

If you need to perform any preprocessing of input before it is parsed
you may want to override B<preprocess_line()> and/or
B<preprocess_paragraph()>.

Sometimes it may be necessary to make more than one pass over the input
files. If this is the case you have several options. You can make the
first pass using B<Pod::Parser> and override your methods to store the
intermediate results in memory somewhere for the B<end_pod()> method to
process. You could use B<Pod::Parser> for several passes with an
appropriate state variable to control the operation for each pass. If
your input source can't be reset to start at the beginning, you can
store it in some other structure as a string or an array and have that
structure implement a B<getline()> method (which is all that
B<parse_from_filehandle()> uses to read input).

Feel free to add any member data fields you need to keep track of things
like current font, indentation, horizontal or vertical position, or
whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA">
to avoid name collisions.

For the most part, the B<Pod::Parser> base class should be able to
do most of the input parsing for you and leave you free to worry about
how to interpret the commands and translate the result.
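
As a minimal sketch of the last multi-pass option described above (an
in-memory structure that implements B<getline()>), the hypothetical class
C<LineBuffer> below keeps its input as an array of lines. The class name
and its C<rewind()> helper are assumptions made for this illustration and
are not part of B<Pod::Parser>. The sketch also assumes, as the text above
states, that B<getline()> is the only method the reader needs; whether a
given release accepts such an object should be checked against its source.

```perl
    use strict;
    use warnings;

    package LineBuffer;    # hypothetical name, not part of Pod::Parser

    ## Build a buffer from a string, keeping line terminators intact.
    sub new {
        my ($class, $text) = @_;
        my @lines = split /(?<=\n)/, $text;
        return bless { lines => \@lines, pos => 0 }, $class;
    }

    ## Return the next line, or undef once the input is exhausted.
    sub getline {
        my ($self) = @_;
        return undef if $self->{pos} >= @{ $self->{lines} };
        return $self->{lines}[ $self->{pos}++ ];
    }

    ## Reset to the beginning so that another pass can be made.
    sub rewind {
        my ($self) = @_;
        $self->{pos} = 0;
        return $self;
    }

    package main;

    my $buf   = LineBuffer->new("=head1 NAME\n\nDemo\n");
    my $count = 0;
    $count++ while defined $buf->getline();    # first pass
    $buf->rewind();                            # ready for a second pass
    print "lines: $count\n";
```

Keeping the read position inside the object is what makes a second pass
possible for sources that cannot otherwise be rewound.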

Note that all we have described here in this quick overview is the
simplest, most straightforward use of B<Pod::Parser> to do stream-based
parsing. It is also possible to use the B<Pod::Parser::parse_text> function
to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">.

=head1 PARSING OPTIONS

A I<parse-option> is simply a named option of B<Pod::Parser> with a
value that corresponds to a certain specified behavior. These various
behaviors of B<Pod::Parser> may be enabled/disabled by setting
or unsetting one or more I<parse-options> using the B<parseopts()> method.
The set of currently accepted parse-options is as follows:

=over 3

=item B<-want_nonPODs> (default: unset)

Normally (by default) B<Pod::Parser> will only provide access to
the POD sections of the input. Input paragraphs that are not part
of the POD-format documentation are not made available to the caller
(not even using B<preprocess_paragraph()>). Setting this option to a
non-empty, non-zero value will allow B<preprocess_paragraph()> to see
non-POD sections of the input as well as POD sections. The B<cutting()>
method can be used to determine if the corresponding paragraph is a POD
paragraph, or some other input paragraph.

=item B<-process_cut_cmd> (default: unset)

Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive
by itself and does not pass it on to the caller for processing. Setting
this option to a non-empty, non-zero value will cause B<Pod::Parser> to
pass the C<=cut> directive to the caller just like any other POD command
(and hence it may be processed by the B<command()> method).

B<Pod::Parser> will still interpret the C<=cut> directive to mean that
"cutting mode" has been (re)entered, but the caller will get a chance
to capture the actual C<=cut> paragraph itself for whatever purpose
it desires.

=item B<-warnings> (default: unset)

Normally (by default) B<Pod::Parser> recognizes a bare minimum of
pod syntax errors and warnings and issues diagnostic messages
for errors, but not for warnings. (Use B<Pod::Checker> to do more
thorough checking of POD syntax.) Setting this option to a non-empty,
non-zero value will cause B<Pod::Parser> to issue diagnostics for
the few warnings it recognizes as well as the errors.

=back

Please see L<"parseopts()"> for a complete description of the interface
for the setting and unsetting of parse-options.

=cut

#############################################################################

use vars qw(@ISA);
use strict;
#use diagnostics;
use Pod::InputObjects;
use Carp;
use Exporter;
BEGIN {
   if ($] < 5.6) {
      require Symbol;
      import Symbol;
   }
}
@ISA = qw(Exporter);

## These "variables" are used as local "glob aliases" for performance
use vars qw(%myData %myOpts @input_stack);

#############################################################################

=head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES

B<Pod::Parser> provides several methods which most subclasses will probably
want to override. These methods are as follows:

=cut

##---------------------------------------------------------------------------

=head1 B<command()>

            $parser->command($cmd,$text,$line_num,$pod_para);

This method should be overridden by subclasses to take the appropriate
action when a POD command paragraph (denoted by a line beginning with
"=") is encountered. When such a POD directive is seen in the input,
this method is called and is passed:

=over 3

=item C<$cmd>

the name of the command for this POD paragraph

=item C<$text>

the paragraph text for the given POD paragraph command.

=item C<$line_num>

the line-number of the beginning of the paragraph

=item C<$pod_para>

a reference to a C<Pod::Paragraph> object which contains further
information about the paragraph command (see L<Pod::InputObjects>
for details).

=back

B<Note> that this method I<is> called for C<=pod> paragraphs.

The base class implementation of this method simply treats the raw POD
command as a normal block of paragraph text (invoking the B<textblock()>
method with the command paragraph).

=cut

sub command {
    my ($self, $cmd, $text, $line_num, $pod_para) = @_;
    ## Just treat this like a textblock
    $self->textblock($pod_para->raw_text(), $line_num, $pod_para);
}

##---------------------------------------------------------------------------

=head1 B<verbatim()>

            $parser->verbatim($text,$line_num,$pod_para);

This method may be overridden by subclasses to take the appropriate
action when a block of verbatim text is encountered. It is passed the
following parameters:

=over 3

=item C<$text>

the block of text for the verbatim paragraph

=item C<$line_num>

the line-number of the beginning of the paragraph

=item C<$pod_para>

a reference to a C<Pod::Paragraph> object which contains further
information about the paragraph (see L<Pod::InputObjects>
for details).

=back

The base class implementation of this method simply prints the textblock
(unmodified) to the output filehandle.

=cut

sub verbatim {
    my ($self, $text, $line_num, $pod_para) = @_;
    my $out_fh = $self->{_OUTPUT};
    print $out_fh $text;
}

##---------------------------------------------------------------------------

=head1 B<textblock()>

            $parser->textblock($text,$line_num,$pod_para);

This method may be overridden by subclasses to take the appropriate
action when a normal block of POD text is encountered (although the base
class method will usually do what you want). It is passed the following
parameters:

=over 3

=item C<$text>

the block of text for a POD paragraph

=item C<$line_num>

the line-number of the beginning of the paragraph

=item C<$pod_para>

a reference to a C<Pod::Paragraph> object which contains further
information about the paragraph (see L<Pod::InputObjects>
for details).

=back

In order to process interior sequences, subclass implementations of
this method will probably want to invoke either B<interpolate()> or
B<parse_text()>, passing it the text block C<$text>, and the corresponding
line number in C<$line_num>, and then perform any desired processing upon
the returned result.

The base class implementation of this method simply prints the text block
as it occurred in the input stream.

=cut

sub textblock {
    my ($self, $text, $line_num, $pod_para) = @_;
    my $out_fh = $self->{_OUTPUT};
    print $out_fh $self->interpolate($text, $line_num);
}

##---------------------------------------------------------------------------

=head1 B<interior_sequence()>

            $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq);

This method should be overridden by subclasses to take the appropriate
action when an interior sequence is encountered. An interior sequence is
an embedded command within a block of text which appears as a command
name (usually a single uppercase character) followed immediately by a
string of text which is enclosed in angle brackets. This method is
passed the sequence command C<$seq_cmd> and the corresponding text
C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior
sequence that occurs in the string that it is passed. It should return
the desired text string to be used in place of the interior sequence.
The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence>
object which contains further information about the interior sequence.
Please see L<Pod::InputObjects> for details if you need to access this
additional information.

Subclass implementations of this method may wish to invoke the
B<nested()> method of C<$pod_seq> to see if it is nested inside
some other interior-sequence (and if so, which kind).

The base class implementation of the B<interior_sequence()> method
simply returns the raw text of the interior sequence (as it occurred
in the input) to the caller.

=cut

sub interior_sequence {
    my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_;
    ## Just return the raw text of the interior sequence
    return $pod_seq->raw_text();
}

#############################################################################

=head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES

B<Pod::Parser> provides several methods which subclasses may want to override
to perform any special pre/post-processing. These methods do I<not> have to
be overridden, but it may be useful for subclasses to take advantage of them.

=cut

##---------------------------------------------------------------------------

=head1 B<new()>

            my $parser = Pod::Parser->new();

This is the constructor for B<Pod::Parser> and its subclasses. You
I<do not> need to override this method! It is capable of constructing
subclass objects as well as base class objects, provided you use
any of the following constructor invocation styles:

    my $parser1 = MyParser->new();
    my $parser2 = new MyParser();
    my $parser3 = $parser2->new();

where C<MyParser> is some subclass of B<Pod::Parser>.

Using the syntax C<MyParser::new()> to invoke the constructor is I<not>
recommended, but if you insist on being able to do this, then the
subclass I<will> need to override the B<new()> constructor method. If
you do override the constructor, you I<must> be sure to invoke the
B<initialize()> method of the newly blessed object.

Using any of the above invocations, the first argument to the
constructor is always the corresponding package name (or object
reference). No other arguments are required, but if desired, an
associative array (or hash-table) may be passed to the B<new()>
constructor, as in:

    my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 );
    my $parser2 = new MyParser( -myflag => 1 );

All arguments passed to the B<new()> constructor will be treated as
key/value pairs in a hash-table. The newly constructed object will be
initialized by copying the contents of the given hash-table (which may
have been empty). The B<new()> constructor for this class and all of its
subclasses returns a blessed reference to the initialized object (hash-table).

=cut

sub new {
    ## Determine if we were called via an object-ref or a classname
    my $this = shift;
    my $class = ref($this) || $this;
    ## Any remaining arguments are treated as initial values for the
    ## hash that is used to represent this object.
    my %params = @_;
    my $self = { %params };
    ## Bless ourselves into the desired class and perform any initialization
    bless $self, $class;
    $self->initialize();
    return $self;
}

##---------------------------------------------------------------------------

=head1 B<initialize()>

            $parser->initialize();

This method performs any necessary object initialization. It takes no
arguments (other than the object instance of course, which is typically
copied to a local variable named C<$self>). If subclasses override this
method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>.

=cut

sub initialize {
    #my $self = shift;
    #return;
}

##---------------------------------------------------------------------------

=head1 B<begin_pod()>

            $parser->begin_pod();

This method is invoked at the beginning of processing for each POD
document that is encountered in the input. Subclasses should override
this method to perform any per-document initialization.

=cut

sub begin_pod {
    #my $self = shift;
    #return;
}

##---------------------------------------------------------------------------

=head1 B<begin_input()>

            $parser->begin_input();

This method is invoked by B<parse_from_filehandle()> immediately I<before>
processing input from a filehandle. The base class implementation does
nothing; however, subclasses may override it to perform any per-file
initializations.

Note that if multiple files are parsed for a single POD document
(perhaps the result of some future C<=include> directive) this method
is invoked for every file that is parsed. If you wish to perform certain
initializations once per document, then you should use B<begin_pod()>.
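
The constructor and the B<initialize()> hook described above can be
sketched without B<Pod::Parser> itself. In this illustration C<BaseParser>
is a hypothetical stand-in that copies the B<new()> logic shown above, and
C<MyParser> with its C<_INDENT> field is likewise invented for the example;
a real subclass would inherit from B<Pod::Parser> instead.

```perl
    use strict;
    use warnings;

    package BaseParser;    # hypothetical stand-in for Pod::Parser

    sub new {
        my $this  = shift;
        my $class = ref($this) || $this;    # accept a class name or an object
        my $self  = { @_ };                 # arguments become object fields
        bless $self, $class;
        $self->initialize();
        return $self;
    }

    sub initialize { return }

    package MyParser;      # hypothetical subclass
    our @ISA = ('BaseParser');

    sub initialize {
        my $self = shift;
        $self->SUPER::initialize();         # always chain to the base class
        $self->{_INDENT} = 0 unless defined $self->{_INDENT};
        return;
    }

    package main;

    my $parser1 = MyParser->new( -myflag => 1 );
    my $parser2 = $parser1->new();          # invoked via an existing object
    print ref($parser2), " indent=$parser2->{_INDENT}\n";
```

Because the constructor blesses into C<ref($this) || $this>, the subclass
gets correctly typed objects from either invocation style without
overriding B<new()>.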

=cut

sub begin_input {
    #my $self = shift;
    #return;
}

##---------------------------------------------------------------------------

=head1 B<end_input()>

            $parser->end_input();

This method is invoked by B<parse_from_filehandle()> immediately I<after>
processing input from a filehandle. The base class implementation does
nothing; however, subclasses may override it to perform any per-file
cleanup actions.

Please note that if multiple files are parsed for a single POD document
(perhaps the result of some kind of C<=include> directive) this method
is invoked for every file that is parsed. If you wish to perform certain
cleanup actions once per document, then you should use B<end_pod()>.

=cut

sub end_input {
    #my $self = shift;
    #return;
}

##---------------------------------------------------------------------------

=head1 B<end_pod()>

            $parser->end_pod();

This method is invoked at the end of processing for each POD document
that is encountered in the input. Subclasses should override this method
to perform any per-document finalization.

=cut

sub end_pod {
    #my $self = shift;
    #return;
}

##---------------------------------------------------------------------------

=head1 B<preprocess_line()>

          $textline = $parser->preprocess_line($text, $line_num);

This method should be overridden by subclasses that wish to perform
any kind of preprocessing for each I<line> of input (I<before> it has
been determined whether or not it is part of a POD paragraph). The
parameter C<$text> is the input line; and the parameter C<$line_num> is
the line number of the corresponding text line.

The value returned should correspond to the new text to use in its
place. If the empty string or an undefined value is returned then no
further processing will be performed for this line.

Please note that the B<preprocess_line()> method is invoked I<before>
the B<preprocess_paragraph()> method. After all (possibly preprocessed)
lines in a paragraph have been assembled together and it has been
determined that the paragraph is part of the POD documentation from one
of the selected sections, then B<preprocess_paragraph()> is invoked.

The base class implementation of this method returns the given text.

=cut

sub preprocess_line {
    my ($self, $text, $line_num) = @_;
    return $text;
}

##---------------------------------------------------------------------------

=head1 B<preprocess_paragraph()>

            $textblock = $parser->preprocess_paragraph($text, $line_num);

This method should be overridden by subclasses that wish to perform any
kind of preprocessing for each block (paragraph) of POD documentation
that appears in the input stream. The parameter C<$text> is the POD
paragraph from the input file; and the parameter C<$line_num> is the
line number for the beginning of the corresponding paragraph.

The value returned should correspond to the new text to use in its
place. If the empty string or an undefined value is returned, then the
given C<$text> is ignored (not processed).

This method is invoked after gathering up all the lines in a paragraph
and after determining the cutting state of the paragraph,
but before trying to further parse or interpret them. After
B<preprocess_paragraph()> returns, the current cutting state (which
is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates
to true then input text (including the given C<$text>) is cut (not
processed) until the next POD directive is encountered.
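
As an illustration of the return-value convention described above, the
following sketch defines the two preprocessing hooks on a hypothetical
C<CommentFilter> package and calls them directly. The package name, the
C<#> comment prefix and the C<=for private> marker are assumptions chosen
for the example; in real use the package would list B<Pod::Parser> in
C<@ISA> and the parser would call the hooks itself.

```perl
    use strict;
    use warnings;

    package CommentFilter;    # hypothetical; a real filter would inherit
                              # from Pod::Parser via @ISA

    ## Strip a leading "# " so that POD embedded in comments is seen as POD.
    sub preprocess_line {
        my ($self, $text, $line_num) = @_;
        $text =~ s/^#[ ]?//;
        return $text;
    }

    ## Drop paragraphs marked "=for private" by returning the empty string;
    ## every other paragraph is passed through unchanged.
    sub preprocess_paragraph {
        my ($self, $text, $line_num) = @_;
        return '' if $text =~ /^=for\s+private\b/;
        return $text;
    }

    package main;

    my $line = CommentFilter->preprocess_line("# =head1 NAME\n", 1);
    my $kept = CommentFilter->preprocess_paragraph("Some text.\n\n", 3);
    my $cut  = CommentFilter->preprocess_paragraph("=for private notes\n\n", 5);
    print $line;
```

Returning the (possibly modified) text keeps a line or paragraph in play,
while returning the empty string tells the caller to skip it.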

Please note that the B<preprocess_line()> method is invoked I<before>
the B<preprocess_paragraph()> method. After all (possibly preprocessed)
lines in a paragraph have been assembled together and either it has been
determined that the paragraph is part of the POD documentation from one
of the selected sections or the C<-want_nonPODs> option is true,
then B<preprocess_paragraph()> is invoked.

The base class implementation of this method returns the given text.

=cut

sub preprocess_paragraph {
    my ($self, $text, $line_num) = @_;
    return $text;
}

#############################################################################

=head1 METHODS FOR PARSING AND PROCESSING

B<Pod::Parser> provides several methods to process input text. These
methods typically won't need to be overridden (and in some cases they
can't be overridden), but subclasses may want to invoke them to exploit
their functionality.

=cut

##---------------------------------------------------------------------------

=head1 B<parse_text()>

            $ptree1 = $parser->parse_text($text, $line_num);
            $ptree2 = $parser->parse_text({%opts}, $text, $line_num);
            $ptree3 = $parser->parse_text(\%opts, $text, $line_num);

This method is useful if you need to perform your own interpolation
of interior sequences and can't rely upon B<interpolate> to expand
them in simple bottom-up order.

The parameter C<$text> is a string or block of text to be parsed
for interior sequences; and the parameter C<$line_num> is the
line number corresponding to the beginning of C<$text>.

B<parse_text()> will parse the given text into a parse-tree of "nodes".
Each "node" in the parse tree is either a text-string or a
B<Pod::InteriorSequence>. The result returned is a
parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects>
for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>.

If desired, an optional hash-ref may be specified as the first argument
to customize certain aspects of the parse-tree that is created and
returned. The set of recognized option keywords is:

=over 3

=item B<-expand_seq> =E<gt> I<code-ref>|I<method-name>

Normally, the parse-tree returned by B<parse_text()> will contain an
unexpanded C<Pod::InteriorSequence> object for each interior-sequence
encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand"
every interior-sequence it sees by invoking the referenced function
(or named method of the parser object) and using the return value as the
expanded result.

If a subroutine reference was given, it is invoked as:

  &$code_ref( $parser, $sequence )

and if a method-name was given, it is invoked as:

  $parser->method_name( $sequence )

where C<$parser> is a reference to the parser object, and C<$sequence>
is a reference to the interior-sequence object.
[I<NOTE>: If the B<interior_sequence()> method is specified, then it is
invoked according to the interface specified in L<"interior_sequence()">].

=item B<-expand_text> =E<gt> I<code-ref>|I<method-name>

Normally, the parse-tree returned by B<parse_text()> will contain a
text-string for each contiguous sequence of characters outside of an
interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to
"preprocess" every such text-string it sees by invoking the referenced
function (or named method of the parser object) and using the return value
as the preprocessed (or "expanded") result. [Note that if the result is
an interior-sequence, then it will I<not> be expanded as specified by the
B<-expand_seq> option; any such recursive expansion needs to be handled by
the specified callback routine.]

If a subroutine reference was given, it is invoked as:

  &$code_ref( $parser, $text, $ptree_node )

and if a method-name was given, it is invoked as:

  $parser->method_name( $text, $ptree_node )

where C<$parser> is a reference to the parser object, C<$text> is the
text-string encountered, and C<$ptree_node> is a reference to the current
node in the parse-tree (usually an interior-sequence object or else the
top-level node of the parse-tree).

=item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name>

Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an
argument to the referenced subroutine (or named method of the parser
object) and return the result instead of the parse-tree object.

If a subroutine reference was given, it is invoked as:

  &$code_ref( $parser, $ptree )

and if a method-name was given, it is invoked as:

  $parser->method_name( $ptree )

where C<$parser> is a reference to the parser object, and C<$ptree>
is a reference to the parse-tree object.

=back

=cut

sub parse_text {
    my $self = shift;
    local $_ = '';

    ## Get options and set any defaults
    my %opts = (ref $_[0]) ? %{ shift() } : ();
    my $expand_seq   = $opts{'-expand_seq'}   || undef;
    my $expand_text  = $opts{'-expand_text'}  || undef;
    my $expand_ptree = $opts{'-expand_ptree'} || undef;

    my $text = shift;
    my $line = shift;
    my $file = $self->input_file();
    my $cmd  = "";

    ## Convert method calls into closures, for our convenience
    my $xseq_sub   = $expand_seq;
    my $xtext_sub  = $expand_text;
    my $xptree_sub = $expand_ptree;
    if (defined $expand_seq  and  $expand_seq eq 'interior_sequence') {
        ## If 'interior_sequence' is the method to use, we have to pass
        ## more than just the sequence object, we also need to pass the
        ## sequence name and text.
        $xseq_sub = sub {
            my ($self, $iseq) = @_;
            my $args = join("", $iseq->parse_tree->children);
            return  $self->interior_sequence($iseq->name, $args, $iseq);
        };
    }
    ref $xseq_sub    or  $xseq_sub   = sub { shift()->$expand_seq(@_) };
    ref $xtext_sub   or  $xtext_sub  = sub { shift()->$expand_text(@_) };
    ref $xptree_sub  or  $xptree_sub = sub { shift()->$expand_ptree(@_) };

    ## Keep track of the "current" interior sequence, and maintain a stack
    ## of "in progress" sequences.
    ##
    ## NOTE that we push our own "accumulator" at the very beginning of the
    ## stack. It's really a parse-tree, not a sequence; but it implements
    ## the methods we need so we can use it to gather-up all the sequences
    ## and strings we parse. Thus, by the end of our parsing, it should be
    ## the only thing left on our stack and all we have to do is return it!
    ##
    my $seq       = Pod::ParseTree->new();
    my @seq_stack = ($seq);
    my ($ldelim, $rdelim) = ('', '');

    ## Iterate over all sequence starts and text (NOTE: split with
    ## capturing parens keeps the delimiters)
    $_ = $text;
    my @tokens = split /([A-Z]<(?:<+\s)?)/;
    while ( @tokens ) {
        $_ = shift @tokens;
        ## Look for the beginning of a sequence
        if ( /^([A-Z])(<(?:<+\s)?)$/ ) {
            ## Push a new sequence onto the stack of those "in-progress"
            my $ldelim_orig;
            ($cmd, $ldelim_orig) = ($1, $2);
            ($ldelim = $ldelim_orig) =~ s/\s+$//;
            ($rdelim = $ldelim) =~ tr/</>/;
            $seq = Pod::InteriorSequence->new(
                       -name   => $cmd,
                       -ldelim => $ldelim_orig,  -rdelim => $rdelim,
                       -file   => $file,         -line   => $line
                   );
            (@seq_stack > 1)  and  $seq->nested($seq_stack[-1]);
            push @seq_stack, $seq;
        }
        ## Look for sequence ending
        elsif ( @seq_stack > 1 ) {
            ## Make sure we match the right kind of closing delimiter
            my ($seq_end, $post_seq) = ("", "");
            if ( ($ldelim eq '<'  and  /\A(.*?)(>)/s)
                 or  /\A(.*?)(\s+$rdelim)/s )
            {
                ## Found end-of-sequence, capture the interior and the
                ## closing delimiter, and put the rest back on the
                ## token-list
                $post_seq = substr($_, length($1) + length($2));
                ($_, $seq_end) = ($1, $2);
                (length $post_seq)  and  unshift @tokens, $post_seq;
            }
            if (length) {
                ## In the middle of a sequence, append this text to it, and
                ## don't forget to "expand" it if that's what the caller wanted
                $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
                $_ .= $seq_end;
            }
            if (length $seq_end) {
                ## End of current sequence, record terminating delimiter
                $seq->rdelim($seq_end);
                ## Pop it off the stack of "in progress" sequences
                pop @seq_stack;
                ## Append result to its parent in current parse tree
                $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq)
                                                   : $seq);
                ## Remember the current cmd-name and left-delimiter
                if (@seq_stack > 1) {
                    $cmd = $seq_stack[-1]->name;
                    $ldelim = $seq_stack[-1]->ldelim;
                    $rdelim = $seq_stack[-1]->rdelim;
                } else {
                    $cmd = $ldelim = $rdelim = '';
                }
            }
        }
        elsif (length) {
            ## Plain text outside of any sequence: append it to the
            ## top-level parse-tree, and don't forget to "expand" it if
            ## that's what the caller wanted
            $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
        }
        ## Keep track of line count
        $line += s/\r*\n//;
        ## Remember the "current" sequence
        $seq = $seq_stack[-1];
    }

    ## Handle unterminated sequences
    my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef;
    while (@seq_stack > 1) {
        ($cmd, $file, $line) = ($seq->name, $seq->file_line);
        $ldelim  = $seq->ldelim;
        ($rdelim = $ldelim) =~ tr/</>/;
        $rdelim  =~ s/^(\S+)(\s*)$/$2$1/;
        pop @seq_stack;
        my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}".
                     " at line $line in file $file\n";
        (ref $errorsub) and &{$errorsub}($errmsg)
            or (defined $errorsub) and $self->$errorsub($errmsg)
                or  warn($errmsg);
        $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq);
        $seq = $seq_stack[-1];
    }

    ## Return the resulting parse-tree
    my $ptree = (pop @seq_stack)->parse_tree;
    return  $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree;
}

##---------------------------------------------------------------------------

=head1 B<interpolate()>

            $textblock = $parser->interpolate($text, $line_num);

This method translates all text (including any embedded interior sequences)
in the given text string C<$text> and returns the interpolated result. The
parameter C<$line_num> is the line number corresponding to the beginning
of C<$text>.

B<interpolate()> merely invokes a private method to recursively expand
nested interior sequences in bottom-up order (innermost sequences are
expanded first). If there is a need to expand nested sequences in
some alternate order, use B<parse_text> instead.

=cut

sub interpolate {
    my($self, $text, $line_num) = @_;
    my %parse_opts = ( -expand_seq => 'interior_sequence' );
    my $ptree = $self->parse_text( \%parse_opts, $text, $line_num );
    return  join "", $ptree->children();
}

##---------------------------------------------------------------------------

=begin __PRIVATE__

=head1 B<parse_paragraph()>

            $parser->parse_paragraph($text, $line_num);

This method takes the text of a POD paragraph to be processed, along
with its corresponding line number, and invokes the appropriate method
(one of B<command()>, B<verbatim()>, or B<textblock()>).

For performance reasons, this method is invoked directly without any
dynamic lookup; hence subclasses may I<not> override it!
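
The tokenizing step that B<parse_text()> performs above can be tried in
isolation. The sketch below applies the same C<split> pattern to a sample
string (the sample text is made up for this example) and derives the
closing delimiter the way the parsing loop does. It only illustrates these
two steps and is not a substitute for B<parse_text()>.

```perl
    use strict;
    use warnings;

    my $text = 'See B<bold> and C<< $a <=> $b >> here';

    ## split with capturing parentheses keeps each sequence start
    ## ("B<", "C<< ") as its own token between the text chunks.
    my @tokens = split /([A-Z]<(?:<+\s)?)/, $text;

    ## For a sequence start, derive the matching closing delimiter:
    ## drop trailing whitespace, then turn each "<" into ">".
    my @pairs;
    for my $tok (@tokens) {
        next unless $tok =~ /^([A-Z])(<(?:<+\s)?)$/;
        my ($cmd, $ldelim) = ($1, $2);
        $ldelim =~ s/\s+$//;
        (my $rdelim = $ldelim) =~ tr/</>/;
        push @pairs, "$cmd $ldelim $rdelim";
    }

    print join("|", @tokens), "\n";
    print join(", ", @pairs), "\n";
```

Because the comparison operator in the sample is not preceded by an
uppercase letter, it is left inside a text token rather than being
mistaken for a sequence start.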

=end __PRIVATE__

=cut

sub parse_paragraph {
    my ($self, $text, $line_num) = @_;
    local *myData = $self;  ## alias to avoid deref-ing overhead
    local *myOpts = ($myData{_PARSEOPTS} ||= {});  ## get parse-options
    local $_;

    ## See if we want to preprocess nonPOD paragraphs as well as POD ones.
    my $wantNonPods = $myOpts{'-want_nonPODs'};

    ## Update cutting status
    $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/;

    ## Perform any desired preprocessing if we wanted it this early
    $wantNonPods  and  $text = $self->preprocess_paragraph($text, $line_num);

    ## Ignore up until next POD directive if we are cutting
    return if $myData{_CUTTING};

    ## Now we know this is a block of text in a POD section!

    ##-----------------------------------------------------------------
    ## This is a hook (hack ;-) for Pod::Select to do its thing without
    ## having to override methods, but also without Pod::Parser assuming
    ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS
    ## field exists then we assume there is an is_selected() method for
    ## us to invoke; calling $self->can('is_selected') could verify this
    ## but that is more overhead than I want to incur)
    ##-----------------------------------------------------------------

    ## Ignore this block if it isn't in one of the selected sections
    if (exists $myData{_SELECTED_SECTIONS}) {
        $self->is_selected($text)  or  return ($myData{_CUTTING} = 1);
    }

    ## If we haven't already, perform any desired preprocessing and
    ## then re-check the "cutting" state
    unless ($wantNonPods) {
        $text = $self->preprocess_paragraph($text, $line_num);
        return 1  unless ((defined $text) and (length $text));
        return 1  if ($myData{_CUTTING});
    }

    ## Look for one of the three types of paragraphs
    my ($pfx, $cmd, $arg, $sep) = ('', '', '', '');
    my $pod_para = undef;
    if ($text =~ /^(={1,2})(?=\S)/) {
        ## Looks like a command paragraph. Capture the command prefix used
        ## ("=" or "=="), as well as the command-name, its paragraph text,
        ## and whatever sequence of characters was used to separate them
        $pfx = $1;
        $_ = substr($text, length $pfx);
        ($cmd, $sep, $text) = split /(\s+)/, $_, 2;
        ## If this is a "cut" directive then we don't need to do anything
        ## except return to "cutting" mode.
        if ($cmd eq 'cut') {
            $myData{_CUTTING} = 1;
            return  unless $myOpts{'-process_cut_cmd'};
        }
    }
    ## Save the attributes indicating how the command was specified.
    $pod_para = new Pod::Paragraph(
          -name      => $cmd,
          -text      => $text,
          -prefix    => $pfx,
          -separator => $sep,
          -file      => $myData{_INFILE},
          -line      => $line_num
    );
    # ## Invoke appropriate callbacks
    # if (exists $myData{_CALLBACKS}) {
    #    ## Look through the callback list, invoke callbacks,
    #    ## then see if we need to do the default actions
    #    ## (invoke_callbacks will return true if we do).
    #    return  1  unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para);
    # }
    if (length $cmd) {
        ## A command paragraph
        $self->command($cmd, $text, $line_num, $pod_para);
    }
    elsif ($text =~ /^\s+/) {
        ## Indented text - must be a verbatim paragraph
        $self->verbatim($text, $line_num, $pod_para);
    }
    else {
        ## Looks like an ordinary block of text
        $self->textblock($text, $line_num, $pod_para);
    }
    return  1;
}

##---------------------------------------------------------------------------

=head1 B<parse_from_filehandle()>

    $parser->parse_from_filehandle($in_fh,$out_fh);

This method takes an input filehandle (which is assumed to already be
opened for reading) and reads the entire input stream looking for blocks
(paragraphs) of POD documentation to be processed. If no first argument
is given the default input filehandle C<STDIN> is used.

The C<$in_fh> parameter may be any object that provides a B<getline()>
method to retrieve a single line of input text (hence, an appropriate
wrapper object could be used to parse PODs from a single string or an
array of strings).

Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled
into paragraphs or "blocks" (which are separated by lines containing
nothing but whitespace). For each block of POD documentation
encountered it will invoke a method to parse the given paragraph.

If a second argument is given then it should correspond to a filehandle where
output should be sent (otherwise the default output filehandle is
C<STDOUT> if no output filehandle is currently in use).

B<NOTE:> For performance reasons, this method caches the input stream at
the top of the stack in a local variable. Any attempts by clients to
change the stack contents while this method is executing
I<will not affect> the input stream used by the current
invocation of this method.

This method does I<not> usually need to be overridden by subclasses.

=cut

sub parse_from_filehandle {
    my $self = shift;
    my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
    my ($in_fh, $out_fh) = @_;
    $in_fh = \*STDIN  unless ($in_fh);
    local *myData = $self;  ## alias to avoid deref-ing overhead
    local *myOpts = ($myData{_PARSEOPTS} ||= {});  ## get parse-options
    local $_;

    ## Put this stream at the top of the stack and do beginning-of-input
    ## processing. NOTE that $in_fh might be reset during this process.
    my $topstream = $self->_push_input_stream($in_fh, $out_fh);
    (exists $opts{-cutting})  and  $self->cutting( $opts{-cutting} );

    ## Initialize line/paragraph
    my ($textline, $paragraph) = ('', '');
    my ($nlines, $plines) = (0, 0);

    ## Use <$fh> instead of $fh->getline where possible (for speed)
    $_ = ref $in_fh;
    my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/  or  tied $in_fh);

    ## Read paragraphs line-by-line
    while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) {
        $textline = $self->preprocess_line($textline, ++$nlines);
        next  unless ((defined $textline)  &&  (length $textline));

        if ((! length $paragraph) && ($textline =~ /^==/)) {
            ## '==' denotes a one-line command paragraph
            $paragraph = $textline;
            $plines    = 1;
            $textline  = '';
        } else {
            ## Append this line to the current paragraph
            $paragraph .= $textline;
            ++$plines;
        }

        ## See if this line is blank and ends the current paragraph.
        ## If it isn't, then keep iterating until it is.
        next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/)
                                     && (length $paragraph));

        ## Issue a warning about any non-empty blank lines
        if (length($1) > 0 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) {
            my $errorsub = $self->errorsub();
            my $file = $self->input_file();
            my $errmsg = "*** WARNING: line containing nothing but whitespace".
                         " in paragraph at line $nlines in file $file\n";
            (ref $errorsub) and &{$errorsub}($errmsg)
                or (defined $errorsub) and $self->$errorsub($errmsg)
                    or  warn($errmsg);
        }

        ## Now process the paragraph
        parse_paragraph($self, $paragraph, ($nlines - $plines) + 1);
        $paragraph = '';
        $plines = 0;
    }
    ## Don't forget about the last paragraph in the file
    if (length $paragraph) {
        parse_paragraph($self, $paragraph, ($nlines - $plines) + 1)
    }

    ## Now pop the input stream off the top of the input stack.
    $self->_pop_input_stream();
}

##---------------------------------------------------------------------------

=head1 B<parse_from_file()>

    $parser->parse_from_file($filename,$outfile);

This method takes a filename and does the following:

=over 2

=item *

opens the input file for reading and the output file for writing
(creating the appropriate filehandles)

=item *

invokes the B<parse_from_filehandle()> method passing it the
corresponding input and output filehandles.

=item *

closes the input and output files.

=back

If the special input filename "-" or "<&STDIN" is given then the STDIN
filehandle is used for input (and no open or close is performed). If no
input filename is specified then "-" is implied.

If a second argument is given then it should be the name of the desired
output file. If the special output filename "-" or ">&STDOUT" is given
then the STDOUT filehandle is used for output (and no open or close is
performed). If the special output filename ">&STDERR" is given then the
STDERR filehandle is used for output (and no open or close is
performed). If no output filehandle is currently in use and no output
filename is specified, then "-" is implied.
Alternatively, an L<IO::String> object is also accepted as an output
file handle.

This method does I<not> usually need to be overridden by subclasses.

=cut

sub parse_from_file {
    my $self = shift;
    my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
    my ($infile, $outfile) = @_;
    my ($in_fh,  $out_fh);
    if ($] < 5.006) {
        ($in_fh,  $out_fh) = (gensym(), gensym());
    }
    my ($close_input, $close_output) = (0, 0);
    local *myData = $self;
    local *_;

    ## Is $infile a filename or a (possibly implied) filehandle
    if (defined $infile && ref $infile) {
        if (ref($infile) =~ /^(SCALAR|ARRAY|HASH|CODE|REF)$/) {
            croak "Input from $1 reference not supported!\n";
        }
        ## Must be a filehandle-ref (or else assume its a ref to an object
        ## that supports the common IO read operations).
        $myData{_INFILE} = ${$infile};
        $in_fh = $infile;
    }
    elsif (!defined($infile) || !length($infile) || ($infile eq '-')
        || ($infile =~ /^<&(?:STDIN|0)$/i))
    {
        ## Not a filename, just a string implying STDIN
        $infile ||= '-';
        $myData{_INFILE} = "<standard input>";
        $in_fh = \*STDIN;
    }
    else {
        ## We have a filename, open it for reading
        $myData{_INFILE} = $infile;
        open($in_fh, "< $infile")  or
             croak "Can't open $infile for reading: $!\n";
        $close_input = 1;
    }

    ## NOTE: we need to be *very* careful when "defaulting" the output
    ## file. We only want to use a default if this is the beginning of
    ## the entire document (but *not* if this is an included file). We
    ## determine this by seeing if the input stream stack has been set-up
    ## already

    ## Is $outfile a filename, a (possibly implied) filehandle, maybe a ref?
    if (ref $outfile) {
        ## we need to check for ref() first, as other checks involve reading
        if (ref($outfile) =~ /^(ARRAY|HASH|CODE)$/) {
            croak "Output to $1 reference not supported!\n";
        }
        elsif (ref($outfile) eq 'SCALAR') {
#           # NOTE: IO::String isn't a part of the perl distribution,
#           #       so probably we shouldn't support this case...
#           require IO::String;
#           $myData{_OUTFILE} = "$outfile";
#           $out_fh = IO::String->new($outfile);
            croak "Output to SCALAR reference not supported!\n";
        }
        else {
            ## Must be a filehandle-ref (or else assume its a ref to an
            ## object that supports the common IO write operations).
            $myData{_OUTFILE} = ${$outfile};
            $out_fh = $outfile;
        }
    }
    elsif (!defined($outfile) || !length($outfile) || ($outfile eq '-')
        || ($outfile =~ /^>&?(?:STDOUT|1)$/i))
    {
        if (defined $myData{_TOP_STREAM}) {
            $out_fh = $myData{_OUTPUT};
        }
        else {
            ## Not a filename, just a string implying STDOUT
            $outfile ||= '-';
            $myData{_OUTFILE} = "<standard output>";
            $out_fh  = \*STDOUT;
        }
    }
    elsif ($outfile =~ /^>&(STDERR|2)$/i) {
        ## Not a filename, just a string implying STDERR
        $myData{_OUTFILE} = "<standard error>";
        $out_fh  = \*STDERR;
    }
    else {
        ## We have a filename, open it for writing
        $myData{_OUTFILE} = $outfile;
        (-d $outfile) and croak "$outfile is a directory, not POD input!\n";
        open($out_fh, "> $outfile")  or
             croak "Can't open $outfile for writing: $!\n";
        $close_output = 1;
    }

    ## Whew! That was a lot of work to set up reasonably/robust behavior
    ## in the case of a non-filename for reading and writing. Now we just
    ## have to parse the input and close the handles when we're finished.
    $self->parse_from_filehandle(\%opts, $in_fh, $out_fh);

    $close_input  and
        close($in_fh) || croak "Can't close $infile after reading: $!\n";
    $close_output  and
        close($out_fh) || croak "Can't close $outfile after writing: $!\n";
}

#############################################################################

=head1 ACCESSOR METHODS

Clients of B<Pod::Parser> should use the following methods to access
instance data fields:

=cut

##---------------------------------------------------------------------------

=head1 B<errorsub()>

    $parser->errorsub("method_name");
    $parser->errorsub(\&warn_user);
    $parser->errorsub(sub { print STDERR @_ });

Specifies the method or subroutine to use when printing error messages
about POD syntax. The supplied method/subroutine I<must> return TRUE upon
successful printing of the message. If C<undef> is given, then the B<warn>
builtin is used to issue error messages (this is the default behavior).

    my $errorsub = $parser->errorsub();
    my $errmsg = "This is an error message!\n";
    (ref $errorsub) and &{$errorsub}($errmsg)
        or (defined $errorsub) and $parser->$errorsub($errmsg)
            or  warn($errmsg);

Returns a method name, or else a reference to the user-supplied subroutine
used to print error messages. Returns C<undef> if the B<warn> builtin
is used to issue error messages (this is the default behavior).

=cut

sub errorsub {
    return (@_ > 1) ?
        ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB};
}

##---------------------------------------------------------------------------

=head1 B<cutting()>

    $boolean = $parser->cutting();

Returns the current C<cutting> state: a boolean-valued scalar which
evaluates to true if text from the input file is currently being "cut"
(meaning it is I<not> considered part of the POD document).

    $parser->cutting($boolean);

Sets the current C<cutting> state to the given value and returns the
result.

=cut

sub cutting {
    return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING};
}

##---------------------------------------------------------------------------

=head1 B<parseopts()>

When invoked with no additional arguments, B<parseopts> returns a hashtable
of all the current parsing options.

    ## See if we are parsing non-POD sections as well as POD ones
    my %opts = $parser->parseopts();
    $opts{'-want_nonPODs'} and print "-want_nonPODs\n";

When invoked using a single string, B<parseopts> treats the string as the
name of a parse-option and returns its corresponding value if it exists
(returns C<undef> if it doesn't).

    ## Did we ask to see '=cut' paragraphs?
    my $want_cut = $parser->parseopts('-process_cut_cmd');
    $want_cut and print "-process_cut_cmd\n";

When invoked with multiple arguments, B<parseopts> treats them as
key/value pairs and the specified parse-option names are set to the
given values. Any unspecified parse-options are unaffected.

    ## Set them back to the default
    $parser->parseopts(-warnings => 0);

When passed a single hash-ref, B<parseopts> uses that hash to completely
reset the existing parse-options; all previous parse-option values
are lost.

    ## Reset all options to default
    $parser->parseopts( { } );

See L<"PARSING OPTIONS"> for more information on the name and meaning of each
parse-option currently recognized.

=cut

sub parseopts {
    local *myData = shift;
    local *myOpts = ($myData{_PARSEOPTS} ||= {});
    return %myOpts  if (@_ == 0);
    if (@_ == 1) {
        local $_ = shift;
        return  ref($_)  ?  $myData{_PARSEOPTS} = $_  :  $myOpts{$_};
    }
    my @newOpts = (%myOpts, @_);
    $myData{_PARSEOPTS} = { @newOpts };
}

##---------------------------------------------------------------------------

=head1 B<output_file()>

    $fname = $parser->output_file();

Returns the name of the output file being written.

=cut

sub output_file {
    return $_[0]->{_OUTFILE};
}

##---------------------------------------------------------------------------

=head1 B<output_handle()>

    $fhandle = $parser->output_handle();

Returns the output filehandle object.

=cut

sub output_handle {
    return $_[0]->{_OUTPUT};
}

##---------------------------------------------------------------------------

=head1 B<input_file()>

    $fname = $parser->input_file();

Returns the name of the input file being read.

=cut

sub input_file {
    return $_[0]->{_INFILE};
}

##---------------------------------------------------------------------------

=head1 B<input_handle()>

    $fhandle = $parser->input_handle();

Returns the current input filehandle object.

=cut

sub input_handle {
    return $_[0]->{_INPUT};
}

##---------------------------------------------------------------------------

=begin __PRIVATE__

=head1 B<input_streams()>

    $listref = $parser->input_streams();

Returns a reference to an array which corresponds to the stack of all
the input streams that are currently in the middle of being parsed.

While parsing an input stream, it is possible to invoke
B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input
stream and then return to parsing the previous input stream. Each input
stream to be parsed is pushed onto the end of this input stack
before any of its input is read. The input stream that is currently
being parsed is always at the end (or top) of the input stack. When an
input stream has been exhausted, it is popped off the end of the
input stack.

Each element on this input stack is a reference to a C<Pod::InputSource>
object. Please see L<Pod::InputObjects> for more details.

This method might be invoked when printing diagnostic messages, for example,
to obtain the name and line number of all the input files that are currently
being processed.

=end __PRIVATE__

=cut

sub input_streams {
    return $_[0]->{_INPUT_STREAMS};
}

##---------------------------------------------------------------------------

=begin __PRIVATE__

=head1 B<top_stream()>

    $hashref = $parser->top_stream();

Returns a reference to the hash-table that represents the element
that is currently at the top (end) of the input stream stack
(see L<"input_streams()">). The return value will be C<undef>
if the input stack is empty.

This method might be used when printing diagnostic messages, for example,
to obtain the name and line number of the current input file.

=end __PRIVATE__

=cut

sub top_stream {
    return $_[0]->{_TOP_STREAM} || undef;
}

#############################################################################

=head1 PRIVATE METHODS AND DATA

B<Pod::Parser> makes use of several internal methods and data fields
which clients should not need to see or use. For the sake of avoiding
name collisions for client data and methods, these methods and fields
are briefly discussed here. Determined hackers may obtain further
information about them by reading the B<Pod::Parser> source code.

Private data fields are stored in the hash-object whose reference is
returned by the B<new()> constructor for this class. The names of all
private methods and data-fields used by B<Pod::Parser> begin with a
prefix of "_" and match the regular expression C</^_\w+$/>.

=cut

##---------------------------------------------------------------------------

=begin _PRIVATE_

=head1 B<_push_input_stream()>

    $hashref = $parser->_push_input_stream($in_fh,$out_fh);

This method will push the given input stream on the input stack and
perform any necessary beginning-of-document or beginning-of-file
processing. The argument C<$in_fh> is the input stream filehandle to
push, and C<$out_fh> is the corresponding output filehandle to use (if
it is not given or is undefined, then the current output stream is used,
which defaults to standard output if it doesn't exist yet).

The value returned will be a reference to the hash-table that represents
the new top of the input stream stack. I<Please Note> that it is
possible for this method to use default values for the input and output
file handles.
If this happens, you will need to look at the C<_INPUT>
and C<_OUTPUT> instance data members to determine their new values.

=end _PRIVATE_

=cut

sub _push_input_stream {
    my ($self, $in_fh, $out_fh) = @_;
    local *myData = $self;

    ## Initialize stuff for the entire document if this is *not*
    ## an included file.
    ##
    ## NOTE: we need to be *very* careful when "defaulting" the output
    ## filehandle. We only want to use a default value if this is the
    ## beginning of the entire document (but *not* if this is an included
    ## file).
    unless (defined $myData{_TOP_STREAM}) {
        $out_fh  = \*STDOUT  unless (defined $out_fh);
        $myData{_CUTTING}       = 1;   ## current "cutting" state
        $myData{_INPUT_STREAMS} = [];  ## stack of all input streams
    }

    ## Initialize input indicators
    $myData{_OUTFILE} = '(unknown)'  unless (defined  $myData{_OUTFILE});
    $myData{_OUTPUT}  = $out_fh      if (defined  $out_fh);
    $in_fh            = \*STDIN      unless (defined  $in_fh);
    $myData{_INFILE}  = '(unknown)'  unless (defined  $myData{_INFILE});
    $myData{_INPUT}   = $in_fh;
    my $input_top     = $myData{_TOP_STREAM}
                      = new Pod::InputSource(
                            -name        => $myData{_INFILE},
                            -handle      => $in_fh,
                            -was_cutting => $myData{_CUTTING}
                        );
    local *input_stack = $myData{_INPUT_STREAMS};
    push(@input_stack, $input_top);

    ## Perform beginning-of-document and/or beginning-of-input processing
    $self->begin_pod()  if (@input_stack == 1);
    $self->begin_input();

    return  $input_top;
}

##---------------------------------------------------------------------------

=begin _PRIVATE_

=head1 B<_pop_input_stream()>

    $hashref = $parser->_pop_input_stream();

This takes no arguments. It will perform any necessary end-of-file or
end-of-document processing and then pop the current input stream from
the top of the input stack.
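
The push/pop bookkeeping can be sketched with plain Perl data. This is
only an illustration: the C<%state> hash and the C<push_stream()> and
C<pop_stream()> helpers are hypothetical stand-ins for the parser object
and its private methods, and the begin/end callbacks are left out.

    my %state = (cutting => 1, streams => []);
    sub push_stream {
        my ($name) = @_;
        ## Remember the cutting state in effect when this stream began
        push @{ $state{streams} },
             { name => $name, was_cutting => $state{cutting} };
        return $state{streams}[-1];
    }
    sub pop_stream {
        my $old_top = pop @{ $state{streams} };
        ## Restore the cutting state saved by push_stream()
        $state{cutting} = $old_top->{was_cutting};
        ## New top of the stack, or undef if the stack is now empty
        return @{ $state{streams} } ? $state{streams}[-1] : undef;
    }
    push_stream('outer.pod');
    $state{cutting} = 0;          ## outer file is inside a POD section
    push_stream('included.pod');
    $state{cutting} = 1;          ## included file ended in "cut" mode
    my $top = pop_stream();
    print "$top->{name} cutting=$state{cutting}\n";  ## outer.pod cutting=0

Saving the state on the stack entry is what lets an included file finish
in "cutting" mode without disturbing the file that included it.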

The value returned will be a reference to the hash-table that represents
the new top of the input stream stack.

=end _PRIVATE_

=cut

sub _pop_input_stream {
    my ($self) = @_;
    local *myData = $self;
    local *input_stack = $myData{_INPUT_STREAMS};

    ## Perform end-of-input and/or end-of-document processing
    $self->end_input()  if (@input_stack > 0);
    $self->end_pod()    if (@input_stack == 1);

    ## Restore cutting state to whatever it was before we started
    ## parsing this file.
    my $old_top = pop(@input_stack);
    $myData{_CUTTING} = $old_top->was_cutting();

    ## Don't forget to reset the input indicators
    my $input_top = undef;
    if (@input_stack > 0) {
        $input_top = $myData{_TOP_STREAM} = $input_stack[-1];
        $myData{_INFILE}  = $input_top->name();
        $myData{_INPUT}   = $input_top->handle();
    } else {
        delete $myData{_TOP_STREAM};
        delete $myData{_INPUT_STREAMS};
    }

    return $input_top;
}

#############################################################################

=head1 TREE-BASED PARSING

If straightforward stream-based parsing won't meet your needs (as is
likely the case for tasks such as translating PODs into structured
markup languages like HTML and XML) then you may need to take the
tree-based approach. Rather than doing everything in one pass and
calling the B<interpolate()> method to expand sequences into text, it
may be desirable to instead create a parse-tree using the B<parse_text()>
method to return a tree-like structure which may contain an ordered
list of children (each of which may be a text-string, or a similar
tree-like structure).

Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and
to the objects described in L<Pod::InputObjects>.
The former describes
the gory details and parameters for how to customize and extend the
parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides
several objects that may all be used interchangeably as parse-trees. The
most obvious one is the B<Pod::ParseTree> object. It defines the basic
interface and functionality that anything acting as a POD parse-tree
should provide. A B<Pod::ParseTree> is defined such that each "node" may be a
text-string, or a reference to another parse-tree. Each B<Pod::Paragraph>
object and each B<Pod::InteriorSequence> object also supports the basic
parse-tree interface.

The B<parse_text()> method takes a given paragraph of text, and
returns a parse-tree that contains one or more children, each of which
may be a text-string, or an InteriorSequence object. There are also
callback-options that may be passed to B<parse_text()> to customize
the way it expands or transforms interior-sequences, as well as the
returned result. These callbacks can be used to create a parse-tree
with custom-made objects (which may or may not support the parse-tree
interface, depending on how you choose to do it).

If you wish to turn an entire POD document into a parse-tree, that process
is fairly straightforward. The B<parse_text()> method is the key to doing
this successfully. Every paragraph-callback (i.e. the polymorphic methods
for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes
a B<Pod::Paragraph> object as an argument. Each paragraph object has a
B<parse_tree()> method that can be used to get or set a corresponding
parse-tree. So for each of those paragraph-callback methods, simply call
B<parse_text()> with the options you desire, and then use the returned
parse-tree to assign to the given paragraph object.
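
To make the shape of such a tree concrete, here is a rough,
self-contained sketch. Plain strings and hash-refs stand in for the real
B<Pod::ParseTree> and B<Pod::InteriorSequence> objects, so the
C<flatten()> helper and the C<name>/C<children> keys are illustrative
assumptions and not part of the B<Pod::Parser> API. A tree whose nodes
are either text-strings or nested trees can be turned into text with a
short recursive walk that expands the innermost sequences first:

    ## Hypothetical stand-in: a node is a plain string, or a hash-ref
    ## holding a sequence name and a list of child nodes.
    sub flatten {
        my ($node) = @_;
        return $node  unless ref $node;      ## text-string leaf
        my $inner = join '', map { flatten($_) } @{ $node->{children} };
        return $node->{name} eq 'B' ? "*$inner*"
             : $node->{name} eq 'I' ? "_${inner}_"
             :                        $inner;
    }
    ## Roughly the structure of: some B<bold I<nested>> text
    my $tree = { name => '', children => [
        'some ',
        { name => 'B', children => [
            'bold ', { name => 'I', children => ['nested'] } ] },
        ' text',
    ] };
    print flatten($tree), "\n";    ## some *bold _nested_* text

With the real classes, the same walk would test each child with C<ref>
and recurse into the child's own parse-tree instead of a hash key.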

That gives you a parse-tree for each paragraph - so now all you need is
an ordered list of paragraphs. You can maintain that yourself as a data
element in the object/hash. The most straightforward way would be simply
to use an array-ref, with the desired set of custom "options" for each
invocation of B<parse_text>. Let's assume the desired option-set is
given by the hash C<%options>. Then we might do something like the
following:

    package MyPodParserTree;

    @ISA = qw( Pod::Parser );

    ...

    sub begin_pod {
        my $self = shift;
        $self->{'-paragraphs'} = [];  ## initialize paragraph list
    }

    sub command {
        my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({%options}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        push @{ $parser->{'-paragraphs'} }, $pod_para;
    }

    sub verbatim {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        push @{ $parser->{'-paragraphs'} }, $pod_para;
    }

    sub textblock {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({%options}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        push @{ $parser->{'-paragraphs'} }, $pod_para;
    }

    ...

    package main;
    ...
    my $parser = new MyPodParserTree(...);
    $parser->parse_from_file(...);
    my $paragraphs_ref = $parser->{'-paragraphs'};

Of course, in this module-author's humble opinion, I'd be more inclined to
use the existing B<Pod::ParseTree> object than a simple array. That way
everything in it, paragraphs and sequences, all respond to the same core
interface for all parse-tree nodes. The result would look something like:

    package MyPodParserTree2;

    ...

    sub begin_pod {
        my $self = shift;
        $self->{'-ptree'} = new Pod::ParseTree;  ## initialize parse-tree
    }

    sub parse_tree {
        ## convenience method to get/set the parse-tree for the entire POD
        (@_ > 1)  and  $_[0]->{'-ptree'} = $_[1];
        return $_[0]->{'-ptree'};
    }

    sub command {
        my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        $parser->parse_tree()->append( $pod_para );
    }

    sub verbatim {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        $parser->parse_tree()->append( $pod_para );
    }

    sub textblock {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        $parser->parse_tree()->append( $pod_para );
    }

    ...

    package main;
    ...
    my $parser = new MyPodParserTree2(...);
    $parser->parse_from_file(...);
    my $ptree = $parser->parse_tree;
    ...

Now you have the entire POD document as one great big parse-tree. You
can even use the B<-expand_seq> option to B<parse_text> to insert
whole different kinds of objects. Just don't expect B<Pod::Parser>
to know what to do with them after that. That will need to be in your
code. Or, alternatively, you can insert any object you like so long as
it conforms to the B<Pod::ParseTree> interface.

One could use this to create subclasses of B<Pod::Paragraph> and
B<Pod::InteriorSequence> for specific commands (or to create your own
custom node-types in the parse-tree) and add some kind of B<emit()>
method to each custom node/subclass object in the tree.
Then all you'd
need to do is recursively walk the tree in the desired order, processing
the children (most likely from left to right) by formatting them if
they are text-strings, or by calling their B<emit()> method if they
are objects/references.

=head1 CAVEATS

Please note that POD has the notion of "paragraphs": a paragraph is
something starting I<after> a blank (read: empty) line, with the single
exception of the start of the file, which also starts a paragraph. In
particular, a command (e.g. C<=head1>) I<must> be preceded by a blank
line; C<__END__> is I<not> a blank line.

=head1 SEE ALSO

L<Pod::InputObjects>, L<Pod::Select>

B<Pod::InputObjects> defines POD input objects corresponding to
command paragraphs, parse-trees, and interior-sequences.

B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability
to selectively include and/or exclude sections of a POD document from being
translated based upon the current heading, subheading, subsubheading, etc.

=for __PRIVATE__
B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users
the ability to employ I<callback functions> instead of, or in addition
to, overriding methods of the base class.

=for __PRIVATE__
B<Pod::Select> and B<Pod::Callbacks> do not override any
methods nor do they define any new methods with the same name. Because
of this, they may I<both> be used (in combination) as a base class of
the same subclass in order to combine their functionality without
causing any namespace clashes due to multiple inheritance.

=head1 AUTHOR

Please report bugs using L<http://rt.cpan.org>.

Brad Appleton E<lt>bradapp@enteract.comE<gt>

Based on code for B<Pod::Text> written by
Tom Christiansen E<lt>tchrist@mox.perl.comE<gt>

=cut

1;
# vim: ts=4 sw=4 et
Generated: Tue Mar 17 22:47:18 2015 | Cross-referenced by PHPXref 0.7.1 |