Access all Stories in a Document using PHP

August 31st, 2010 by ohaeuser

This is just a simple working example to access all Text Frames in a document.

Of course you might want to do something with the text, so I did not stop at printing it but added a library that creates an ODT file.
ODT is part of the Open Document Format family, so you can open it with OpenOffice and also with MS Word 2007 SP2+.

MS Office might tell you that the file is broken, BUT open it anyway, choose to repair the file, and voilà …

If you wonder about some oddities in the code … the last time I actually coded something more complex in PHP there were only 3 movies in the Star Wars universe ;-)
I also had to do some encoding magic so the German umlauts came out right in the Office document.
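
The "encoding magic" boils down to converting the UTF-8 text to a single-byte Latin encoding before handing it to the ODT library. Here is a minimal, self-contained sketch of just that conversion. The sample string is made up for illustration, and the sketch assumes that PHP's iconv extension is enabled and that the file is saved as UTF-8.

```php
<?php
// Made-up sample text with German umlauts. In the real script this would be
// the story content returned by IDMLlib, which is assumed to be UTF-8.
$utf8 = "Grüße aus Köln";

// Convert UTF-8 to ISO-8859-15, where ü, ß and ö each take a single byte.
$latin = iconv("UTF-8", "ISO-8859-15", $utf8);

// Converting back restores the original UTF-8 string.
$roundTrip = iconv("ISO-8859-15", "UTF-8", $latin);

echo strlen($utf8), " bytes as UTF-8, ", strlen($latin), " bytes as ISO-8859-15\n";
echo $roundTrip === $utf8 ? "round trip ok\n" : "round trip failed\n";
```

Note that iconv returns false for characters the target encoding cannot represent, so text with characters outside ISO-8859-15 would need extra handling.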

This is of course ONLY an example. It is far from production ready, just a simple proof of concept. The odtPHP class is just a small example of what you can do with the content.
Use your imagination ;-)

In order to try this out you need IDMLlib, a working PHP/Java Bridge and, just for this example, the odtPHP library.

So here is the code – have fun trying it out:

<?php
// Loading Java.inc over HTTP only works if allow_url_include is enabled in php.ini.
require_once("http://localhost:8080/IDML/java/Java.inc");
require_once('/var/www/phpjavabridge/java/include/odt/library/odf.php');

// Open the ODT template and fill in the fixed fields.
$odf = new odf("template1.odt");

$odf->setVars('title', 'IDMLlib Example ODF Export');
$message = "This is just a simple example you can do on a rainy Sunday with some IDMLlib and php knowledge!";
$odf->setVars('message', $message);
 
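// $fullName is the path to the IDML file; it is set by the upload code,
// which is not part of this listing.
$idml = new java("de.fhcon.idmllib.api.elements.Idml", $fullName);
$document = $idml->getDocument();

$docutil = new java("de.fhcon.idmllib.api.util.DocumentUtil");

// $storyList must hold the story IDs of the document; the line that fills it
// is missing from this shortened listing.
$storyArray = java_values($storyList);
$storyNumber = count($storyArray)-1;

echo "Total Number of Stories: ".$storyNumber;

// My InDesign always adds an extra, empty story at the end, so drop it.
unset($storyArray[$storyNumber]);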
 
print "
 
<h2>Content of Textboxes</h2>
 
 
";
 
$article = $odf->setSegment('articles');

foreach ($storyArray as $k => $v) {
    $counter = $k + 1;
    $story = $idml->getAbstractDocument()->getStoryById($v);
    $content = java("de.fhcon.idmllib.api.util.StoryUtil")->getContent($story);
    if (!empty($content)) {
        echo "This is the Content from the Textbox: ".$counter."
".$content;
    }

    // Add this story to the ODT template.
    $title = "Textbox with the StoryID ".$v;

    // This might look weird, but it works perfectly: converting the text to
    // ISO-8859-15 is what made the German umlauts come out right in the ODT file.
    $odtcontent = iconv("UTF-8", "ISO-8859-15", $content);
    $article->title($title);
    $article->text($odtcontent);
    $article->merge();
}

// Write all collected segments into the template and save the result.
$odf->mergeSegment($article);

$odf->saveToDisk("done.odt");
 
?>
 
Download all Texts in one ODF File (MS Office 2007/OpenOffice 3): <a href="output/done.odt">IDML Done</a>

You might want to visit the odtPHP page so you know what's going on: http://www.odtphp.com/
Basically it's quite simple: I created a template.odt as shown in Tutorial 3, uploaded it to my server and then used it to create new ODT files.
I deleted some unnecessary code … My full example also features an upload and shows some more information.
But this should give you a good start if you want to extract text box content from InDesign using PHP and our IDMLlib and do something with it.
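
The listing drops the last story unconditionally, because my InDesign always appends an empty one. If you cannot rely on that, one option is to filter by content instead. The sketch below is plain PHP with made-up story IDs and texts standing in for what IDMLlib would return; `dropEmptyStories` is a hypothetical helper, not part of IDMLlib or odtPHP.

```php
<?php
/**
 * Keep only stories whose text is not empty or whitespace-only.
 * Hypothetical helper: expects an array of story ID => extracted text.
 */
function dropEmptyStories(array $stories): array
{
    return array_filter($stories, function ($text) {
        return trim((string) $text) !== '';
    });
}

// Made-up sample data standing in for story IDs and their extracted text.
$stories = [
    'u10a' => 'Headline',
    'u10b' => 'Body text of the first text frame',
    'u10c' => '',
];

// array_filter keeps the original keys, so the story IDs stay attached.
foreach (dropEmptyStories($stories) as $id => $text) {
    echo "Textbox with the StoryID ", $id, ": ", $text, "\n";
}
```

Unlike the `!empty()` check in the listing, this comparison also keeps a story whose whole text is "0".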
