author Raphael Moll <ralf@android.com> 2010-11-29 15:30:07 -0800
committer Raphael Moll <ralf@android.com> 2010-11-30 12:14:24 -0800
commit 28e1cc36ca032615d2a4c48738c1042a4992e321
tree be66c37a658d02f5be14253fc70a4a35aa983b07 /docs
parent b281c42014e66e6b9ecef5b1911e51ded8d172cb
Move WST doc to sdk/docs.
Change-Id: I53d7838b557da43a9194ddb13ec1bcd0f1119a36
Diffstat (limited to 'docs')
-rwxr-xr-x docs/Notes_on_WST_StructuredDocument.txt 183
1 files changed, 183 insertions, 0 deletions
diff --git a/docs/Notes_on_WST_StructuredDocument.txt b/docs/Notes_on_WST_StructuredDocument.txt
new file mode 100755
index 0000000..dcf124d
--- /dev/null
+++ b/docs/Notes_on_WST_StructuredDocument.txt
@@ -0,0 +1,183 @@
+Notes on WST StructuredDocument
+-------------------------------
+
+Created: 2010/11/26
+References: WST 3.1.x, Eclipse 3.5 Galileo
+
+To manipulate XML documents in refactorings, we sometimes use the WST/SSE
+"StructuredDocument" API. There isn't exactly a lot of documentation on
+this out there, so this is a short explanation of how it works, totally
+based on _empirical_ evidence. As such, it must be taken with a grain of salt.
+
+Examples of usage can be found in
+ sdk/eclipse/plugins/com.android.ide.eclipse.adt/src/com/android/ide/eclipse/adt/internal/refactorings/
+
+
+1- Get a document instance
+--------------------------
+
+To get a document from an existing IFile resource:
+
+    IModelManager modelMan = StructuredModelManager.getModelManager();
+    IStructuredDocument sdoc = modelMan.createStructuredDocumentFor(file);
+
+Note that the IStructuredDocument and all the associated interfaces we'll use
+below are all located in org.eclipse.wst.sse.core.internal.provisional,
+meaning they _might_ change later.
+
+Also note that this parses the content of the file on disk, not of a buffer
+with pending unsaved modifications opened in an editor.
+
+There is a counterpart for non-existent resources:
+
+    IModelManager.createNewStructuredDocumentFor(IFile)
+
+However our goal so far has been to _parse_ existing documents, find
+the place that we wanted to modify and then generate a TextFileChange
+for a refactoring operation. Consequently this document doesn't say
+anything about using this model to modify content directly.
+
+
+2- Structured Document overview
+-------------------------------
+
+The IStructuredDocument is organized in "regions", which are little pieces
+of text.
+
+The document contains a list of region collections, each one being
+a list of regions. Each region has a type, as well as text.
+
+Since we use this to parse XML, let's look at this XML example:
+
+<?xml version="1.0" encoding="utf-8"?> \n
+<resources> \n
+    <color/> \n
+    <string name="my_string">Some Value</string>  <!-- comment -->\n
+</resources>
+
+
+This will result in the following regions and sub-regions:
+(all the constants below are located in DOMRegionContext)
+
+XML_PI_OPEN
+ XML_PI_OPEN:<?
+ XML_TAG_NAME:xml
+ XML_TAG_ATTRIBUTE_NAME:version
+ XML_TAG_ATTRIBUTE_EQUALS:=
+ XML_TAG_ATTRIBUTE_VALUE:"1.0"
+ XML_TAG_ATTRIBUTE_NAME:encoding
+ XML_TAG_ATTRIBUTE_EQUALS:=
+ XML_TAG_ATTRIBUTE_VALUE:"utf-8"
+ XML_PI_CLOSE:?>
+
+XML_CONTENT
+ XML_CONTENT:\n
+
+XML_TAG_NAME
+ XML_TAG_OPEN:<
+ XML_TAG_NAME:resources
+ XML_TAG_CLOSE:>
+
+XML_CONTENT
+ XML_CONTENT:\n + whitespace before color
+
+XML_TAG_NAME
+ XML_TAG_OPEN:<
+ XML_TAG_NAME:color
+ XML_EMPTY_TAG_CLOSE:/>
+
+XML_CONTENT
+ XML_CONTENT:\n + whitespace before string
+
+XML_TAG_NAME
+ XML_TAG_OPEN:<
+ XML_TAG_NAME:string
+ XML_TAG_ATTRIBUTE_NAME:name
+ XML_TAG_ATTRIBUTE_EQUALS:=
+ XML_TAG_ATTRIBUTE_VALUE:"my_string"
+ XML_TAG_CLOSE:>
+
+XML_CONTENT
+ XML_CONTENT:Some Value
+
+XML_TAG_NAME
+ XML_END_TAG_OPEN:</
+ XML_TAG_NAME:string
+ XML_TAG_CLOSE:>
+
+XML_CONTENT
+ XML_CONTENT: (2 spaces before the comment)
+
+XML_COMMENT_TEXT
+ XML_COMMENT_OPEN:<!--
+ XML_COMMENT_TEXT: comment
+ XML_COMMENT_CLOSE:-->
+
+XML_CONTENT
+ XML_CONTENT: \n after comment
+
+XML_TAG_NAME
+ XML_END_TAG_OPEN:</
+ XML_TAG_NAME:resources
+ XML_TAG_CLOSE:>
+
+XML_CONTENT
+ XML_CONTENT:
+
+
+3- Iterating through regions
+----------------------------
+
+To iterate through all regions, we need to process the list of top-level regions and then
+iterate over inner regions:
+
+    for (IStructuredDocumentRegion regions : sdoc.getStructuredDocumentRegions()) {
+        // "regions" is one region collection; process its inner regions
+        for (int i = 0; i < regions.getNumberOfRegions(); i++) {
+            ITextRegion region = regions.getRegions().get(i);
+            String type = region.getType();         // a DOMRegionContext constant
+            String text = regions.getText(region);  // text is queried from the outer region
+        }
+    }
+
+Each "region collection" basically matches one XML tag, with sub-regions for all the tokens
+inside a tag.
+
+Note that an XML_CONTENT region holds the text between tags, whitespace included; this is
+what is known as a TEXT node in the W3C DOM.
+
+Also note that each outer region has a type, but the inner regions also reuse a similar type.
+So for example an outer XML_TAG_NAME region collection is a proper XML tag, and it will contain
+an opening tag, a closing tag but also an XML_TAG_NAME that is the tag name itself.
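+
+As a rough illustration of the above (not taken from the ADT sources), the kind of tag a
+region collection represents can be guessed from its first and last inner regions. The
+sketch below uses plain strings named after the DOMRegionContext constants listed in
+section 2; TagKindSketch and tagKind are made-up names and no WST class is involved.
+

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a region collection is reduced to the list of its
// inner region types. No WST class is used here.
final class TagKindSketch {

    /** Returns "start", "end", "empty" or "other" for one region collection. */
    static String tagKind(List<String> innerTypes) {
        if (innerTypes.isEmpty()) {
            return "other";
        }
        String first = innerTypes.get(0);
        String last = innerTypes.get(innerTypes.size() - 1);
        if (first.equals("XML_END_TAG_OPEN")) {
            return "end";
        }
        if (first.equals("XML_TAG_OPEN")) {
            return last.equals("XML_EMPTY_TAG_CLOSE") ? "empty" : "start";
        }
        return "other"; // content, comment, processing instruction...
    }

    public static void main(String[] args) {
        // Inner region types of <color/>
        System.out.println(tagKind(Arrays.asList(
                "XML_TAG_OPEN", "XML_TAG_NAME", "XML_EMPTY_TAG_CLOSE")));
        // Inner region types of </string>
        System.out.println(tagKind(Arrays.asList(
                "XML_END_TAG_OPEN", "XML_TAG_NAME", "XML_TAG_CLOSE")));
    }
}
```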
+
+Surprisingly, the inner regions do not have many access methods we can use on them, except their
+type and start/length/end. There are two length and end methods:
+- getLength() and getEnd() take any whitespace into account.
+- getTextLength() and getTextEnd() exclude some typical trailing whitespace.
+
+Note that regarding the trailing whitespace, empirical evidence shows that in the XML case
+here, the only case where it matters is in a tag such as <string name="my_string">: for the
+XML_TAG_NAME region, getLength is 7 (string + space) and getTextLength is 6 (string, no space).
+Spacing between XML elements is its own collapsed region.
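+
+To make the 7 versus 6 arithmetic concrete, here is a small sketch that only does string
+manipulation on the example tag. It assumes, as observed above, that the XML_TAG_NAME
+region spans the name plus the space that follows it; RegionLengthSketch and textLength
+are made-up names that merely mimic what getTextLength() appears to report.
+

```java
// Hypothetical sketch: plain string arithmetic, no WST class is used here.
final class RegionLengthSketch {

    /** Length of the given region text once trailing whitespace is dropped. */
    static int textLength(String fullRegionText) {
        int end = fullRegionText.length();
        while (end > 0 && Character.isWhitespace(fullRegionText.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    public static void main(String[] args) {
        String tag = "<string name=\"my_string\">";
        // Assumed span of the XML_TAG_NAME region: the name plus the space after it.
        String tagNameRegion = tag.substring(1, 8);
        System.out.println("getLength()     ~ " + tagNameRegion.length());
        System.out.println("getTextLength() ~ " + textLength(tagNameRegion));
    }
}
```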
+
+If you want the text of the inner region, you actually need to query it from the outer region.
+The outer IStructuredDocumentRegion (the region collection) contains lots more useful access
+methods, some of which return details on the inner regions:
+- getText : without the whitespace.
+- getFullText : with the whitespace.
+- getStart / getLength / getEnd : type-dependent offset, including whitespace.
+- getStart / getTextLength / getTextEnd : type-dependent offset, excluding "irrelevant" whitespace.
+- getStartOffset / getEndOffset / getTextEndOffset : relative to document.
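+
+The following sketch shows one way to picture why the text has to come from the outer
+region: the inner region only carries a type and offsets, while the outer one owns the
+characters. Inner and Outer are made-up stand-ins, not the WST interfaces, and the sketch
+assumes that inner offsets are relative to the outer region, which should be checked
+against the javadoc. The document offset of 60 is arbitrary.
+

```java
// Hypothetical stand-ins; the real interfaces are ITextRegion and
// IStructuredDocumentRegion, which this sketch does not use.
final class OuterRegionSketch {

    /** Inner region: only a type and offsets, no text of its own. */
    static final class Inner {
        final String type;
        final int start;       // assumed relative to the outer region
        final int length;      // includes trailing whitespace
        final int textLength;  // excludes trailing whitespace

        Inner(String type, int start, int length, int textLength) {
            this.type = type;
            this.start = start;
            this.length = length;
            this.textLength = textLength;
        }
    }

    /** Outer region collection: owns the text and its position in the document. */
    static final class Outer {
        final int startOffset; // relative to the document
        final String fullText;

        Outer(int startOffset, String fullText) {
            this.startOffset = startOffset;
            this.fullText = fullText;
        }

        String getText(Inner r) {
            return fullText.substring(r.start, r.start + r.textLength);
        }

        String getFullText(Inner r) {
            return fullText.substring(r.start, r.start + r.length);
        }

        int getStartOffset(Inner r) {
            return startOffset + r.start;
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer(60, "<string name=\"my_string\">");
        Inner tagName = new Inner("XML_TAG_NAME", 1, 7, 6);
        System.out.println("[" + outer.getText(tagName) + "]");
        System.out.println("[" + outer.getFullText(tagName) + "]");
        System.out.println(outer.getStartOffset(tagName));
    }
}
```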
+
+Empirical evidence shows that there is no discernible difference between the getStart/getEnd
+values and those returned by getStartOffset/getEndOffset. Please abide by the javadoc.
+
+All offsets start at zero.
+
+Given a region collection, you can also browse regions either using a getRegions() list, or
+using getFirst/getLastRegion, or using getRegionAtCharacterOffset(). Iterating the region
+list seems the most useful scenario. There's no actual iterator provided for inner regions.
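+
+A typical reason to walk the region list is to find the value of a given attribute. The
+sketch below does this on a made-up Region stand-in holding a type and a text (in real
+code the text would be queried from the outer region, as in section 3). The class and
+method names are hypothetical and the type strings simply mirror the DOMRegionContext
+constants of section 2.
+

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: Region is a stand-in, no WST class is used here.
final class AttributeLookupSketch {

    static final class Region {
        final String type;
        final String text;

        Region(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Returns the unquoted value of the named attribute, or null if absent. */
    static String findAttributeValue(List<Region> regions, String attrName) {
        boolean nameSeen = false;
        for (Region r : regions) {
            if (r.type.equals("XML_TAG_ATTRIBUTE_NAME")) {
                // Remember whether the attribute we just entered is the wanted one.
                nameSeen = r.text.equals(attrName);
            } else if (nameSeen && r.type.equals("XML_TAG_ATTRIBUTE_VALUE")) {
                return stripQuotes(r.text);
            }
        }
        return null;
    }

    /** Removes one pair of matching surrounding quotes, if present. */
    static String stripQuotes(String value) {
        if (value.length() >= 2) {
            char q = value.charAt(0);
            if ((q == '"' || q == '\'') && value.charAt(value.length() - 1) == q) {
                return value.substring(1, value.length() - 1);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Inner regions of <string name="my_string">
        List<Region> regions = Arrays.asList(
                new Region("XML_TAG_OPEN", "<"),
                new Region("XML_TAG_NAME", "string"),
                new Region("XML_TAG_ATTRIBUTE_NAME", "name"),
                new Region("XML_TAG_ATTRIBUTE_EQUALS", "="),
                new Region("XML_TAG_ATTRIBUTE_VALUE", "\"my_string\""),
                new Region("XML_TAG_CLOSE", ">"));
        System.out.println(findAttributeValue(regions, "name"));
    }
}
```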
+
+There are a few other methods available in the regions classes. This was not an exhaustive list.
+
+
+----